---
aliases: Gen AI, ChatGPT
---
### Infographic
![[Generative AI - ByteByteGo.png]]

### Key Terms
- Generative AI - AI tool that receives a user prompt and generates plausible output using a large language model trained on data sourced from the internet and other sources.
- Training data
    - Mainly sourced from publicly available resources. English is the most prominent language because most internet content is written in English, but data in other languages is available from other sources too. Wikipedia and Reddit are examples of data sources.
    - Training data has a cutoff, set by the date the internet was scraped, so information about later events may not be available.
- Large language model - LLM foundation model
    - Language model built with the generative pre-trained transformer technique using training data (scraped from the internet, publicly available data)
    - Next token/phrase prediction is calculated with matrix multiplication over numerical tokens that represent parts of text from a given sentence in the training data
    - As training data and model features/parameters grow, the model starts outputting human-readable, logical next text phrases, which makes an interactive, AI-chatbot-style experience possible
- Plausible output - Based on training data, a probability distribution over possible next text is generated for a given input, so that the most probable text is output
- Prompt - Input data given by the user
- Hallucination - Output that is plausible but factually incorrect, e.g., generating an image of 2 hands totaling 10 fingers where one hand has 4 fingers and the other has 6
- Evals or Evaluations - Measuring performance and quality of an LLM to ensure safety, efficacy and goal alignment. See [[Generative AI#Evals|Evals section]] on this page
- Tokenization - Raw text into tokens into integers
  ![[Tokenization.png|1000]]
- Multi-modal -
    - LLM that supports multiple types of data - text, audio, image, video.
    - Initial models were based on text data, and other 'modalities' were added later. For example, with images, tokens are generated from pixel data and used to generate/predict new pixels
- Embeddings = Low-dimensional, numerical vector representation of complex data types [Embeddings explained](https://www.linkedin.com/pulse/embeddings-explained-plain-english-subramanian-subbu-iyer?utm_source=share&utm_medium=member_ios&utm_campaign=share_via)
    - An embedding is a vector of numbers for a given piece of data, typically produced by a neural network that learns how to represent each object well
    - The vectors are then compared with other vectors to find similarity and closeness
    - Embeddings are used to understand closeness of different words (e.g., "king" and "queen" may be closer than "king" and "crocodile")
    - Use cases -
        - NLP
        - Recommendation systems
        - Computer vision
        - Search engines - semantic search uses embeddings to uncover the "meaning" or "intent" of a search query instead of a simple keyword match
- Dimensions = Attributes that explain the relationship between data elements (e.g., a movie can have dimensions like genre, emotion, color)

### Model creation steps
##### Core technical steps
1. GPT - Foundation model using the GPT approach - Generative Pre-trained Transformer. Uses an internet data stream to train a model
2. SFT - Supervised Fine Tuning - Uses labelled data to improve model responses for specific or nuanced topics
3. RLHF - Reinforcement Learning from Human Feedback - Goal-based self-learning in which the model tries out multiple approaches and thinking strategies to answer questions, and those answers are then rated using a reward model and human feedback
4. Internet search data can be integrated to make sure the model has access to the latest data

Taking steps 1, 2, 3 creates a GPT version.

##### Gen AI Development Lifecycle
1. Scoping and defining use case
    1. Understand problem space and opportunity
    2. Feasibility (technical, strategic, and financial feasibility) of Gen AI approach
    3. Establish clear success metrics
2. Model selection and customization
    1. Select pre-trained foundation model
    2. Assess customization options - prompt engineering, RAG techniques, fine tuning using a domain-specific or proprietary dataset (e.g., JP Morgan using internal financial data to fine-tune a model)
3. Data preparation - create labeled data
    1. RAG requires embedding the knowledge base and storing the embeddings in a vector database
    2. Fine tuning requires high-quality labeled data built from proprietary information that is clean, free of bias and representative of the use case at hand
4. Development and Evals
    1. Dev may involve deciding how the Gen AI application will behave, chain of commands, prompt flows or integrations with other tools
    2. Evals come in multiple types
        1. Qualitative - humans evaluating gen AI responses to assess hallucination, quality, safety
        2. Quantitative - to assess correctness, grammar
        3. Another LLM as a judge - to do Evals at scale
5. Deployment, monitoring and iterations
    1. Deploy model to production and create user-facing application
    2. Monitor performance and gen AI specific metrics
        1. Traditional metrics around availability, errors, latency
        2. Gen AI specific metrics around "model drift", where model performance degrades over time
    3. Iterations - improve the model through prompt updates, RAG document knowledge base upgrades, or redoing fine-tuning to better align to goals

### Use cases
##### Text only
1. Chatbot style question and answer - e.g., build an itinerary to visit Grand Canyon and Las Vegas, or help write an email message for a given input prompt
2. Deep research - Approach that mimics how humans answer a question thoroughly: first build a step-by-step plan, then add information, create a draft and revise it into a final answer.
3. File upload - use the user's text prompt plus information from PDF, Word, Excel files to add context to the prompt and output a relevant answer
4. RAG - Retrieval Augmented Generation - Find relevant information or documents in a separate knowledge base and provide them to the LLM as context on top of the user's query to answer it
5. Code interpreter - For a given prompt and input code, explain the code, understand errors or suggest improvements
6. Data analysis - Perform calculations on top of the data from the user prompt
7. Create charts, graphs, diagrams, plots - Use the data from the user prompt to build visuals
8. Summarization - For a given text, summarize the concept by picking the main themes and presenting them in a reduced word count
9. Memorization - remember important facts, notes about the user or knowledge of a topic

##### Multimodal - Text, Audio, Image, Video
1. Conversational AI - e.g., automated customer support, sales support agent
2. Summarizing text into audio podcast - NotebookLM
3. Optical character recognition - OCR - e.g., scanning a nutrition label to determine if the product is healthy or not
4. Image generation - DALL-E - text to image
5. Video analysis - Meta Ray-Ban video stream used by a blind person to understand what is in front of them. Used by the "Be My Eyes" app
6. Video summarization - YouTube videos summarized into a text paragraph
7. Video generation - Google Veo - text to video
8. Audio generation - create music or audio or podcast based on user input data (text or audio or video)
9. Learning a new topic or language - e.g., learning Korean or learning piano keyboard, each of which has topic-specific nuances

### Evals
Definition: Systematic method of evaluating performance, quality, health of an LLM application

Benefits
1. Measuring quality of model - check if fine-tuning is improving model performance, accuracy, helpfulness
2. Aligning with goals - to ensure the model aligns with the business or product goal. E.g., a chatbot used for customer care should be 'polite' and 'helpful' and 'not hallucinate'
3. Identify biases, loopholes, security vulnerabilities - to minimize misuse (tricking the model into creating harmful outputs), prevent leakage of sensitive or proprietary information, and avoid negative branding
4. Comparing model performances - E.g., GPT-3.5 vs Llama 3 - assessing performance to understand which model excels in what area

### AI Agents
https://www.linkedin.com/posts/pawel-huryn_agents-are-the-most-valuable-skill-in-ai-activity-7339343474720178176-EiiQ
![[Building AI Agent.jpeg]]

### Webinar - Quick intro to Gen AI
[Introduction to Generative AI (youtube.com)](https://www.youtube.com/watch?v=N3sMj-sBcLg)
- AI - Artificial Intelligence - broad field of using computers and machines to build intelligence that matches or surpasses human capabilities
- ML - Machine Learning - Subset of AI. Learns from data (without explicit programming) the complex mathematical relations needed to provide an output value for a given set of inputs
    - Supervised ML - uses labeled data to draw correlations or mathematical patterns
    - Unsupervised ML - uses unlabeled data to surface inherent patterns
- Deep Learning - Subset of ML. Uses multiple layers of artificial neural networks, loosely modeled on the brain's neural network, so that it can process substantially complex tasks like identifying winning moves in the game of Go (AlphaGo). It can process both labeled and unlabeled data
- Discriminative AI - typically used to classify or predict a value (probability) by learning the relationship between features/attributes and labeled data. E.g., classify if an image is a dog or a cat
- Generative AI - typically used to generate new data given unstructured/unlabeled training data and a prompt/input example. E.g., generate an image of a dog
- Large Language Models - Used to predict the next best letter/word/phrase

![[Identifying GenAI use.png|200]]

Insights
1. Gen AI and LLMs make supervised learning faster.
2. Bigger value gain is still in supervised learning use cases. - Andrew Ng

[[DCQO Gen AI]]

New findings: RLHF vs DPO
![[Pasted image 20240114171033.png]]

### Generative AI in Enterprise:
"Generative AI in Enterprise" organized on Saturday, July 1st, 2023:
https://iitghy.webex.com/recordingservice/sites/iitghy/recording/playback/680f2c96fa4f103ba7f8529fa58baa85
Passcode: 3qPEMJMW

[5 Ways To Use AI in Ecommerce | Salesforce](https://www.salesforce.com/blog/ai-in-ecommerce/)
[Marketing and sales soar with generative AI | McKinsey](https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/ai-powered-marketing-and-sales-reach-new-heights-with-generative-ai)
![[Pasted image 20230816165324.png|600]]
![[Pasted image 20230816165401.png|800]]
![[Pasted image 20230816165723.png|800]]
![[Pasted image 20230816165754.png|800]]

### 2023 Microsoft Build - Andrej Karpathy
![[Pasted image 20230619142819.png|1000]]

### Gen AI Landscape - March 2023
![[Pasted image 20230823194355.jpg]]

### Dell
![[Gen AI - big bets.png]]
![[Pasted image 20230926084346.png]]
![[ChatGPT for kids.png]]
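
Supplement to the Embeddings entry under Key Terms: a minimal sketch of how vector closeness might be compared, using cosine similarity. The three-dimensional vectors and the names `EMBEDDINGS`, `cosine_similarity` and `most_similar` are hypothetical, hand-picked for illustration. Real embeddings come from a trained model and usually have hundreds of dimensions, and cosine similarity is only one common choice of closeness measure.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only.
# A real system would obtain these vectors from a trained embedding model.
EMBEDDINGS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "crocodile": [0.1, 0.05, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (other for other in embeddings if other != word),
        key=lambda other: cosine_similarity(embeddings[word], embeddings[other]),
    )


if __name__ == "__main__":
    for other in ("queen", "crocodile"):
        score = cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS[other])
        print(f"king vs {other}: {score:.3f}")
    print("closest to king:", most_similar("king", EMBEDDINGS))
```

With these toy vectors, "king" scores closer to "queen" than to "crocodile", matching the example in the Embeddings entry. Semantic search works along the same lines: embed the query, then rank stored vectors by similarity.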