Generative AI

A generative AI learning pathway


Building applications with LLMs is not the first step of learning generative AI, and merely using LLMs does not teach it either. Yet that is where most data scientists start: building LLM-based applications that leverage the power of LLMs to interpret and generate natural human language. This makes sense, because the complete set of use cases where generative AI can improve our lives is not yet understood, and discovering them takes disciplined experimentation. Hence, “using” LLMs is a science.

The other science is “building” LLMs. One may argue that models trained on humongous amounts of data already exist. Why not use them? Why reinvent the wheel? That sounds about right. The only problem is that it is not very progressive thinking: we are assuming that the OpenAIs and Anthropics of the world have already built the best possible LLMs. Moreover, the best way to use something is to know how it is built. Here is the generative AI learning pathway I would recommend.
First, statistics for ML. That covers descriptive and inferential statistics, samples and populations, distributions, and the central limit theorem. You will get a flavour of loss functions as well. Building the best ML model is all about minimising the loss function; during training, LLMs use cross entropy as theirs. Cross entropy measures how far the model’s predicted probability distribution over words is from the actual distribution. Don’t forget the section on probabilities while going through statistics for ML.
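To make cross entropy concrete, here is a minimal Python sketch for a single next-word prediction; the three-word vocabulary and the probabilities are made up purely for illustration:

```python
import numpy as np

# Toy next-word prediction: a tiny vocabulary and one training example.
# The actual distribution puts all probability on the true next word.
vocab = ["cat", "sat", "mat"]
actual = np.array([0.0, 1.0, 0.0])       # the true next word is "sat"
predicted = np.array([0.2, 0.7, 0.1])    # the model's predicted distribution

# Cross entropy: -sum(actual * log(predicted)).
# It is small when the model assigns high probability to the true word.
loss = -np.sum(actual * np.log(predicted))
print(f"cross-entropy loss: {loss:.4f}")  # -log(0.7) ≈ 0.3567
```

Training nudges the model’s weights so that this loss, averaged over billions of such predictions, keeps shrinking.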

The second is data exploration. You must familiarise yourself with summarising, exploring, and investigating datasets: interact with databases and flat files, and create visualisations that convey your interpretation of the data. You should be able to do this for structured and unstructured (textual) data, which will take you into the realm of natural language processing (NLP). In a sense, generative AI is NLP.
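As a taste of what this step looks like in practice, here is a small pandas sketch; the customer dataset, its column names, and its values are invented for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# A tiny made-up dataset standing in for a real database table or flat file.
df = pd.DataFrame({
    "segment": ["retail", "retail", "corporate", "corporate", "retail"],
    "annual_spend": [120.5, 340.0, 980.2, 1150.7, 89.9],
})

# Summarise numeric variables: count, mean, std, min, quartiles, max.
print(df.describe())

# Frequencies of occurrence for a categorical variable.
print(df["segment"].value_counts())

# A quick visualisation to convey your interpretation of the data.
df["annual_spend"].plot(kind="hist", bins=5, title="Annual spend")
plt.show()
```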
The third is ML modelling techniques. I cannot stress enough how important it is for a data scientist to understand why a supervised model works the way it does. Learning ML is not about building models without ever looking under the hood. You must appreciate how the model “approximates” the dataset provided during the training process. We summarise a dataset by determining the minimum, maximum, standard deviation, average, and quartile values of its numeric variables, and by listing the frequencies of occurrence of the values of a categorical variable. That summary is just a few small sets of values standing in for the whole dataset. By a little stretch of the concept, an ML model is also a “summary” of the dataset, but an approximate one. That is precisely why it can predict for unseen data.
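To illustrate the “approximate summary” idea, here is a scikit-learn sketch on synthetic data (the relationship y ≈ 3x + 2 is made up): the fitted model compresses 200 points into two numbers, and that compression is exactly what lets it predict for unseen inputs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y is roughly 3x + 2 plus noise, invented for illustration.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X.ravel() + 2 + rng.normal(0, 1, size=200)

# The fitted model is an approximate "summary" of the dataset:
# two numbers (slope and intercept) stand in for 200 points.
model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)   # close to 3 and 2

# Because it summarises rather than memorises, it predicts for unseen data.
print(model.predict([[12.0]]))         # roughly 3 * 12 + 2 = 38
```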

The fourth is deep learning, in other words, artificial neural networks (ANNs). A specific type of ANN called the recurrent neural network (RNN) has been widely used for natural language interpretation and generation. We took a magical leap from RNNs to transformers when we replaced sequential text processing with parallel text processing; our very own transformer is hence also an ANN. Transformers are the building blocks of LLMs, and LLMs are used to generate content. Therefore, we have the term “generative AI”. I could just as well call them “LLM AI”. If I did, you would bring up SLM AI for small language models and VLM AI for vision language models. I get your point. Let’s cover all these language models with one umbrella and name it generative AI.
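To see why that parallel leap matters, here is a stripped-down sketch of scaled dot-product self-attention in NumPy. For brevity it reuses the token embeddings as queries, keys, and values, whereas a real transformer applies learned projection matrices first:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a whole sequence at once.

    X: (seq_len, d_model) token embeddings. Unlike an RNN, there is no
    loop over time steps: every token attends to every other token in a
    handful of matrix multiplications, which is what makes transformers
    parallel.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ X                                 # contextual embeddings

# Five tokens with 8-dimensional embeddings (random, for illustration).
X = np.random.default_rng(0).normal(size=(5, 8))
print(self_attention(X).shape)  # (5, 8): all positions computed at once
```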

The fifth is the technology ecosystem growing around LLMs. LangChain is a framework that makes interacting with LLMs easier. LM Studio and Ollama are tools that help us download foundational LLMs and run them locally. RAG (retrieval-augmented generation) is usually positioned as an alternative to fine-tuning: it brings context into the LLM’s understanding and response without requiring any retraining. However, nobody stops us from using both simultaneously to make our LLMs super contextual.
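Here is a schematic sketch of the RAG flow in plain Python. The embed and generate functions are hypothetical stand-ins; a real system would call an embedding model and an LLM, for instance through a framework like LangChain:

```python
import numpy as np

# Hypothetical stand-ins: a real system would call an embedding model and
# an LLM API here. These toy versions exist only so the sketch runs.
def embed(text: str) -> np.ndarray:
    v = np.zeros(128)                  # crude bag-of-characters embedding
    for ch in text.lower():
        v[ord(ch) % 128] += 1.0
    return v

def generate(prompt: str) -> str:
    return f"[LLM answer grounded in a {len(prompt)}-character prompt]"

def rag_answer(question: str, documents: list[str], k: int = 2) -> str:
    # 1. Retrieve: rank documents by cosine similarity to the question.
    q = embed(question)
    def sim(doc: str) -> float:
        d = embed(doc)
        return (d @ q) / (np.linalg.norm(d) * np.linalg.norm(q) + 1e-9)
    context = sorted(documents, key=sim, reverse=True)[:k]

    # 2. Augment: place the retrieved context inside the prompt.
    prompt = ("Answer using only this context:\n" + "\n".join(context)
              + f"\n\nQuestion: {question}")

    # 3. Generate: the LLM responds grounded in the context, no fine-tuning.
    return generate(prompt)

docs = ["Policy: refunds within 30 days.", "Office hours are 9 to 5.",
        "Refunds need the original receipt."]
print(rag_answer("How do refunds work?", docs))
```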

Before you go, one last thing. If you have put in so much effort on the complex technical stuff, take the final step of explaining your generative AI solution to the stakeholders (especially business users). Storytelling must take place at different levels of granularity; there is no one-story-fits-all. While the generative AI solution being explained is the same, you must offer different levels of detail to your data science team, managers, senior executives, customers, and business users.

I walk the pathway every day. When I get up, I step out thinking I can zoom past the familiar path I have covered multiple times. Yet I find something new in store for me along the way. I realise that keeping my eyes, ears, and mind open to learning is the best way to navigate such paths. If that means you have to ignore the “gyan” (unsolicited wisdom) of seemingly knowledgeable data scientists, you must.



