We know that AI models are changing at breakneck speed as this emerging technology continues to develop. Developers have a crucial role to play in helping organisations navigate this new landscape – but how do they find their way when so many of the tools and skills are still nascent and so fluid?
Shift to domain-based AI
In the evolving landscape of Artificial Intelligence (AI), we are witnessing a significant paradigm shift from data-centric Machine Learning (ML) to domain-centric approaches. This is the opinion of Ramprakash Ramamoorthy in his position as head of AI research at Zoho. This transformation is underpinned by the emergence of foundational models – large-scale neural networks that are trained on broad, diverse datasets. Unlike traditional models, these neural networks possess the ability to handle multiple tasks without being explicitly trained for each one.
“Consider the earlier approaches in Machine Learning, particularly in data-centric models. In such a setup, we often had to develop multiple distinct models for similar tasks across different contexts. For instance, separate models were needed for sentiment analysis in chat-based customer support and email-based customer support. Each of these models required specific training on data relevant to its particular medium of communication,” said Ramamoorthy.
However, says Ramamoorthy, as we transition to task-centric, or more aptly, domain-centric Machine Learning, the scenario changes drastically. Here, we rely on one broad foundational model that is fine-tuned for a specific domain, such as customer support. This approach not only simplifies the process but also enhances the effectiveness of the model across various applications within the domain. We now foresee a future where foundational models are fine-tuned for specific sectors such as customer support, finance, marketing and so on.
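The contrast between the two approaches can be sketched in code. This is a minimal, hypothetical illustration (all class and function names are invented stand-ins, not a real API): where the data-centric approach needed one bespoke model per task and channel, the domain-centric approach fine-tunes one foundation model and reuses it across channels.

```python
class FoundationModel:
    """Stand-in for a large model pre-trained on broad, diverse data."""
    def __init__(self):
        self.base_knowledge = "broad pre-training corpus"


class FineTunedModel:
    """One foundation model adapted to a domain, shared across channels."""
    def __init__(self, base: FoundationModel, domain: str):
        self.base = base
        self.domain = domain

    def handle(self, task: str, channel: str) -> str:
        # The same fine-tuned model serves chat, email, and other channels,
        # where the older approach required a separate model for each.
        return f"{self.domain} model: {task} via {channel}"


def fine_tune(base: FoundationModel, domain: str) -> FineTunedModel:
    # Fine-tuning adapts the broad base model to one domain's nuances.
    return FineTunedModel(base, domain)


base = FoundationModel()
support = fine_tune(base, "customer-support")
print(support.handle("sentiment analysis", "chat"))
print(support.handle("sentiment analysis", "email"))
```

The point of the sketch is structural: fine-tuning multiplies domains, not models, so sentiment analysis over chat and email no longer needs two separately trained systems.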
Accessible & customisable AI
The Zoho team has worked through a lot of this evolution and notes that the advent of these foundational models is pivotal as it eliminates the data bottleneck that has long been a challenge in developing bespoke AI solutions. This reduction in the need for extensive and specific datasets is argued to lower the entry barrier for many, making AI more accessible and customisable. For developers, this shift presents an opportunity to deepen their skills in fine-tuning these foundation models for specific domains. They can now focus more on adapting and refining the AI to meet the unique needs and nuances of their field, rather than building models from scratch.
“However, even with all this said, it’s crucial to acknowledge that foundational models, once trained, might not be fully attuned to concept drift – the phenomenon where new data emerges post-training that wasn’t included in the original dataset,” cautions Ramamoorthy. “This is where techniques like Retrieval Augmented Generation (RAG) come into play. RAG acts as a connector between the foundation model and contemporary data, allowing the model to pull in specialised, external data to augment its base knowledge. By integrating such techniques, developers can ensure that their AI solutions remain dynamic and relevant, continually adapting to new information and evolving trends in their domain.”
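The RAG flow Ramamoorthy describes can be sketched in a few lines. This toy version retrieves documents by simple keyword overlap; production systems typically use embedding similarity and a vector store, but the shape is the same: retrieve fresh external data, then prepend it to the model's prompt so post-training information reaches the model.

```python
import re


def tokenize(text: str) -> set[str]:
    # Lowercase word tokens; a stand-in for real text processing.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank external documents by word overlap with the query
    # (real RAG systems rank by embedding similarity instead).
    scored = sorted(
        documents,
        key=lambda d: len(tokenize(d) & tokenize(query)),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, documents: list[str]) -> str:
    # The foundation model now sees data that postdates its training.
    context = retrieve(query, documents)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"


docs = [
    "Refund policy updated in 2024: refunds within 30 days.",
    "Office hours are 9am to 5pm.",
    "Shipping now takes 3 business days.",
]
prompt = build_prompt("What is the refund policy?", docs)
print(prompt)
```

The retrieved context rides along with the question, which is how RAG keeps a frozen model's answers current without retraining it.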
Zero-shot & few-shot learning
In the world of AI, the arrival of larger models has brought with it emergent capabilities, some of which are still at the prototype stage. These vast, intricate models have begun demonstrating abilities beyond their initial programming, a testament to the sophistication and potential of modern AI. Key among these abilities are zero-shot and few-shot learning, which represent leaps in how AI understands and interacts with the world.
But what does zero-shot & few-shot learning actually mean?
“Let’s demystify these concepts. Zero-shot learning is like an astute student who grasps a concept without prior direct instruction. For instance, imagine a language model adeptly summarizing legal documents with precision, despite not being explicitly trained in legal terminology, but only on general English,” clarified Zoho’s Ramamoorthy. “On the other side, we have few-shot learning, which can be likened to quick adaptation with minimal guidance. Picture a scenario where an AI model, exposed to just a handful of customer service inquiries, begins to respond with accuracy and context-awareness. This rapid adaptability to new tasks with limited examples is a game-changer in AI’s functionality.”
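At the prompt level, the difference between the two is simply whether any worked examples accompany the request. A minimal sketch (the prompt format here is an illustrative convention, not any particular vendor's API):

```python
def zero_shot(task: str, inp: str) -> str:
    # Zero-shot: no examples - the model relies on general pre-training,
    # like the legal-summarisation case Ramamoorthy describes.
    return f"{task}\n\nInput: {inp}\nOutput:"


def few_shot(task: str, examples: list[tuple[str, str]], inp: str) -> str:
    # Few-shot: a handful of labelled examples steer the model
    # toward the desired behaviour and output format.
    shots = "\n".join(f"Input: {i}\nOutput: {o}" for i, o in examples)
    return f"{task}\n\n{shots}\n\nInput: {inp}\nOutput:"


task = "Classify the customer inquiry as 'billing' or 'technical'."
examples = [
    ("My invoice shows the wrong amount.", "billing"),
    ("The app crashes on startup.", "technical"),
]
print(zero_shot(task, "Why was my card charged twice?"))
print(few_shot(task, examples, "Why was my card charged twice?"))
```

Both prompts go to the same model; the few-shot version pays a little extra context for markedly more predictable outputs, which is often the trade-off that matters in customer-service settings.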
In terms of how these two techniques apply to the real world, their necessity varies, particularly in the enterprise context. In many business environments, there’s a known ground truth or set of procedures that may not necessitate the nuanced adaptability offered by zero-shot or few-shot learning. For enterprise clients, the reliability, predictability and alignment of AI models with established business processes often take precedence. It’s about finding the right tool for the job, not necessarily the most advanced one.
Technically possible, business passable?
“This brings us to the developer’s perspective. As developers, the challenge and opportunity lie in discerning the appropriate application of these emergent capabilities. It’s about striking a balance between harnessing the innovative potential of these models and addressing the practical, often more grounded needs of enterprise clients. Developers must navigate this landscape with a keen understanding of both the technical possibilities and the business realities,” advised Ramamoorthy.
He says that for developers, this means a shift in focus. The goal is not just to create AI solutions that showcase the most advanced capabilities but to develop solutions that are most effective and relevant for the task at hand. This requires a deep understanding of both the technology and the specific context in which it will be applied.
In summary, as we explore the frontiers of AI’s emergent capabilities, we must also ground ourselves in the practical applications and needs of the business world. For developers, this represents a journey of balancing innovation with pragmatism, ensuring that the AI solutions they create are not only technologically advanced but also deeply relevant and valuable to their enterprise clients.
Intersection: developers, hardware & AI
In the evolving world of AI, the relationship between developers, their tools, and the models they build is increasingly complex, especially when it comes to hardware considerations. Data-based Machine Learning models traditionally haven’t required specialised hardware like GPUs for training or inference. However, as we shift towards larger models, the dynamics change significantly.
“The larger the model, the greater the need for specialized hardware to efficiently manage both training and inference processes. This transition introduces a critical decision point for teams: selecting models that align with their computational budget. It’s essential to balance ambition with practicality. While some teams might afford GPUs for training, their budgets may not stretch to cover inference costs. In such scenarios, choosing the right model based on budgetary constraints and specific needs becomes crucial. It’s a fine line to tread, as the additional expense – often referred to as the ‘GPU tax’ – may not always justify the gains,” said Ramamoorthy.
You say math, I say maths
Interestingly, there’s a synergy that can be achieved when narrow and larger models work in tandem. Each has its strengths and, when utilised strategically, can complement the other, offering a more balanced and cost-effective approach.
“Beyond just the AI models, the hardware skills required to build and maintain these systems are becoming increasingly valuable. Skills like managing InfiniBand, a high-speed communication protocol used in high-performance computing, are in high demand. This underscores a broader trend: while developing AI models is a critical skill, understanding the underlying hardware infrastructure is equally important. As for the question of whether developers should have an understanding of the math behind algorithms, it’s a nuanced issue. A fundamental understanding of the mathematics behind what is happening here can provide a deeper insight into how algorithms work, leading to better model tuning and troubleshooting. However, not every developer working with AI needs to be a maths expert,” concluded Zoho’s Ramamoorthy.
Much depends on the role and the level at which one is engaged with the AI development process. For some, a basic understanding might suffice, while for others, especially those working on developing new algorithms or heavily customising existing ones, a more in-depth knowledge could be essential. As AI continues to evolve, so too must the skills and considerations of developers. Balancing model selection with hardware capabilities and costs, and understanding both the software and hardware aspects of AI, are becoming increasingly important in this field.