Unveiling AI: The 3 Pillars of NLP
From Tech Jargon to Daily Talk: Discovering NLP Through Cohere's Practical Examples
🙌 Hey there! Welcome to the first edition of Delve-In, your trusted P@CMU newsletter. Each issue, we journey into the heart of a single product, dissecting its features, exploring its use cases, and uncovering unique insights. We aim to demystify the most innovative tech and startup marvels, equipping you with the knowledge you need to navigate this exciting landscape. Get ready to delve deep and discover new perspectives in this issue of Delve-In!
✨ This week in DELVE-IN
AI applications have transformed how we interact with technology, playing significant roles in fields like music composition, UI generation, and video editing. While OpenAI has been a significant catalyst in this revolution, numerous other companies are also making waves in this space. Ever wondered about the APIs that power your favorite products or could be the foundation for your next startup?
In this issue, we're using Cohere, an AI startup specializing in natural language processing (NLP) APIs, as a case study to delve deeper into the realm of NLP. We'll explore the potential applications and intricacies of NLP through the lens of Cohere's offerings. To make this journey more interactive, we've included links at the end of the article for hands-on exploration.
Let's dive into this week's Delve-In 🤟.
🚀 Cohere: Pioneering the Future of AI
Introducing Cohere, an AI startup transforming the realm of natural language processing (NLP). Its state-of-the-art API allows developers to incorporate advanced language understanding and generation capabilities into their applications easily.
Cohere is making its mark in the digital world, assisting big names like Spotify in refining their podcast recommendations, helping Glean elevate their enterprise search, and collaborating with Jasper to enhance copywriting.
The value of Cohere's work hasn't gone unnoticed. The company is on the brink of securing a whopping $250 million in fresh funding from notable backers such as Nvidia and Salesforce.
However, Cohere's mission isn't to rival OpenAI’s ChatGPT. Instead, as Cohere’s co-founder, Aidan Gomez, articulates: “Our focus isn't on consumer offerings or direct competition with ChatGPT. Instead, we are dedicated to enhancing the enterprise landscape.”
Cohere's dedication to enterprise solutions is evident in their API product, which features three main models:
Embed: Reveal trends and differences across texts.
Generate: Generate context-specific text with ease.
Classify: Extract valuable insights from text.
These models offer an unparalleled advantage by allowing product managers to harness the power of AI without overwhelming their development teams. The API takes care of the complexities involved in creating and serving sophisticated NLP models, freeing developers to concentrate on fine-tuning use cases.
So, what role do these models play in elevating the company's valuation? And, how are they being utilized in enterprise settings?
Let's delve deeper into these questions from the perspective of a product manager at one of Cohere's clients — Spotify.
🧩 Embed: Unveiling Trends and Language Comparison
Embedding is a fundamental NLP technique where text content, ranging from single words to entire articles, is represented as a vector, that is, an array of numbers. This technique enables effective comparison of different text pieces.
For example, sentences with similar meanings like "Hey, what’s up?" and "Hi, how’s it going?" are recognized as related and given similar vectors, placing them close together in the embedding space. In contrast, a sentence like "I love watching soccer" has a distinct meaning, so it is assigned a very different vector that separates it from the cluster of greeting sentences. This shows how the embedding technique can recognize and represent the semantic differences between sentences.
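To make "close together" concrete, here is a minimal sketch that measures closeness with cosine similarity. The three-number vectors are made up for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors (an assumption for this sketch, not model output).
toy_embeddings = {
    "Hey, what's up?": [0.9, 0.1, 0.0],
    "Hi, how's it going?": [0.8, 0.2, 0.1],
    "I love watching soccer": [0.1, 0.1, 0.9],
}

greeting_a = toy_embeddings["Hey, what's up?"]
greeting_b = toy_embeddings["Hi, how's it going?"]
soccer = toy_embeddings["I love watching soccer"]

sim_greetings = cosine_similarity(greeting_a, greeting_b)
sim_mixed = cosine_similarity(greeting_a, soccer)

print(f"greeting vs greeting: {sim_greetings:.2f}")
print(f"greeting vs soccer:   {sim_mixed:.2f}")
```

A score near 1 means two vectors point in almost the same direction; a score near 0 means they have little in common.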
This technique opens the door to numerous intriguing use cases.
💡 Use Cases
🔍 Semantic Search
Semantic Search is the ability to search by meaning instead of simply matching keywords. It is an incredibly efficient method to query documents. Leveraging text embedding, which transforms words into vectors, the model returns responses exhibiting the highest similarity to the vectors corresponding to the search query.
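As a rough sketch of the ranking step described above, the snippet below scores each document against a query vector and returns the closest matches. The titles, the vectors, and the function name semantic_search are all hypothetical; in practice an embedding model would produce the vectors.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vector, documents, top_k=2):
    """Rank (title, vector) pairs by similarity to the query, best first."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in documents
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]

# Made-up vectors; the axes loosely mean (technology, current events, sports).
documents = [
    ("This week in AI and gadgets", [0.8, 0.6, 0.0]),
    ("Champions League recap", [0.0, 0.6, 0.8]),
    ("Sourdough baking basics", [0.0, 0.1, 0.3]),
]

# Stands in for the embedding of the query "technology news".
query_vector = [0.7, 0.7, 0.0]

results = semantic_search(query_vector, documents)
print(results)
```

Notice that the best match never mentions "technology" or "news" in its title; it wins because its vector points the same way as the query's.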
💡 Recommendation
Embedding would represent each item as a vector in a shared space. Recommendation algorithms find related items based on vector proximity, enabling highly personalized recommendations at scale.
📑 Topic Modeling
Topic Modeling is the capacity to cluster similar topics and uncover thematic trends in vast text sources. This is particularly beneficial when you're tasked with exploring thousands or even millions of texts, such as messages, emails, or news headlines.
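One common way to cluster embedded texts is k-means, sketched below on made-up two-number vectors. This is an illustrative stand-in, not a description of how Cohere does it; the headlines, vectors, and fixed starting centroids are assumptions chosen to keep the demo deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    """Component-wise average of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def k_means(vectors, initial_centroids, iterations=10):
    """Return a cluster index for each vector."""
    centroids = [list(c) for c in initial_centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(v, centroids[i]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignments) if a == i]
            if members:
                centroids[i] = mean_vector(members)
    return assignments

# Hypothetical headlines with made-up vectors; axes loosely mean (tech, sports).
headlines = [
    ("Chip maker unveils new GPU", [0.9, 0.1]),
    ("Startup releases open-source language model", [0.8, 0.2]),
    ("Striker signs record transfer deal", [0.1, 0.9]),
    ("Cup final goes to penalties", [0.2, 0.8]),
]
vectors = [vector for _, vector in headlines]

# Fixed seeds for the demo; real runs usually start from random points.
assignments = k_means(vectors, initial_centroids=[vectors[0], vectors[2]])

for (title, _), cluster in zip(headlines, assignments):
    print(cluster, title)
```

Each cluster is a candidate topic; a person (or a Generate call) would still need to give it a readable name.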
💡 Real-World Applications
How can Spotify improve their product with Semantic Search?
Spotify can leverage semantic search to improve their podcast discovery and recommendations.
When you search for a particular topic like “technology news”, you don’t want to just receive a bunch of results that contain the keywords “technology” and “news” but rather a list of podcasts actually discussing the latest news in technology.
In the screenshot above, the top episode recommended when I searched “technology news” is an episode from Nov 2022. It might have been deemed as most relevant because it contains “technology newsletter” in its title. This is the result of keyword search and it doesn’t always return the result I look for.
Semantic search empowers Spotify to go beyond keywords, diving deep into user intent behind search queries like "technology news" to serve up highly relevant podcast recommendations.
In the screenshot above, I manually curated a list of tech news to illustrate an ideal search result that aligns with my interests. Despite these podcasts not specifically mentioning "technology" or "news" in their titles, they are the most timely and pertinent podcasts discussing tech news.
The semantic model goes beyond the superficial layer of names and titles, evaluating relevance based on the actual topics discussed. This method ensures recommendations are personalized to align with user interests, prioritizing substantive connections over mere keyword matches. Such precision, truly understanding the user's intent, is the promise of Semantic Search.
How could Spotify use embedding to enhance recommendations?
Spotify's catalog contains millions of songs, podcasts and other content. Recommendation algorithms help users discover relevant items from this vast collection.
Take podcasts for example. Currently, Spotify primarily suggests broadly similar shows, lacking personalization at the episode level. However, when you find a particular podcast episode especially compelling, tailored recommendations for comparable episodes discussing relevant subjects or themes in depth would resonate most.
By applying recommendation algorithms, the model would locate recently listened episodes in the space, then suggest nearby unheard episodes with inferred relatedness.
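A minimal sketch of that idea: average the vectors of recently played episodes into a single "taste" vector, then rank unheard episodes by how close they sit to it. The episode titles, vectors, and the averaging approach are assumptions for illustration, not Spotify's actual method.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recommend(listened_vectors, unheard, top_k=1):
    """Return the unheard titles closest to the average listened vector."""
    taste = [sum(col) / len(listened_vectors) for col in zip(*listened_vectors)]
    ranked = sorted(
        unheard,
        key=lambda title: cosine_similarity(taste, unheard[title]),
        reverse=True,
    )
    return ranked[:top_k]

# Made-up vectors; axes loosely mean (text generation, computer vision, true crime).
listened_vectors = [
    [0.9, 0.2, 0.0],
    [0.7, 0.4, 0.0],
]
unheard = {
    "Prompting tips for writing assistants": [0.9, 0.1, 0.0],
    "How self-driving cars see the road": [0.2, 0.9, 0.0],
    "Cold case files, part 3": [0.0, 0.0, 1.0],
}

suggestions = recommend(listened_vectors, unheard, top_k=2)
print(suggestions)
```

Because the comparison happens per episode rather than per show, two episodes from very different podcasts can still end up as neighbors.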
Recommendation algorithms can understand your interests well enough to suggest episodes exploring the specific concepts or startups currently holding your attention across sources. Rather than broadly recommending additional reports on artificial intelligence alone, they can determine which unplayed episodes discuss text-generation tools or computer vision in an approachable way for a casual listener yet compelling to anyone keen on the pulse of progress.
Embedding enables a level of nuance and personalization at the episode level that would be difficult for algorithms to achieve otherwise. Embedding-based algorithms can grasp meanings, subjects, and themes to recommend highly relevant episodes, not just additional shows in the same general genre.
💭 Generate: Crafting Contextually Relevant Content
While 'Embedding' might be a relatively foreign concept, the 'Generate' function might ring a bell, particularly for ChatGPT users.
It's a potent tool designed to craft unique content for various contexts, including emails, landing pages, product descriptions, and more.
In essence, when given a prompt or an instruction, the model predicts the next word based on the sequence of words it has seen so far, ultimately producing sentences and paragraphs.
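To show the "predict the next word, then repeat" loop in miniature, here is a toy model that simply counts which word follows which in a tiny corpus. This is a deliberately simplified stand-in: real generative models use large neural networks over tokens, not word-pair counts, and the corpus below is made up.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def generate(model, prompt, max_new_words=5):
    """Greedily append the most frequent next word, one word at a time."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the model has never seen anything follow this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Made-up training sentences for this sketch.
corpus = [
    "this mix of electronic music keeps you focused",
    "this mix of electronic music is perfect for studying",
    "a mix of mellow jazz",
]

model = train_bigram_model(corpus)
completion = generate(model, "this mix", max_new_words=3)
print(completion)
```

Each new word is chosen only from what the model has seen so far, which is the same loop, at a vastly smaller scale, that produces whole sentences and paragraphs.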
What are the applications of ‘Generate’ in enterprises?
💡 Use Cases
🎯 Ad Content
By providing a few examples of the kind of ad copy you want, Cohere's Generate can return numerous unique versions that capture the intended tone and messaging.
📝 Blog Copy
Feed the model with a topic and a prompt, and it will return a comprehensive blog post, needing only minor revisions.
✏️ Product Description
Cohere's Generate provides unique product descriptions that align with your brand voice, simplifying the management of extensive catalogs with thousands of SKUs.
💡 Real-World Applications
How could Spotify leverage Text Generation?
Each week, a Spotify user will likely see some fresh playlists on their home page. For some of these playlists, Spotify provides a few sample descriptions highlighting the mood, genre, or style to convey for different content clusters.
Playlist descriptions play a crucial role in helping users gauge the vibe of the content and decide if it aligns with their preferences.
However, when it comes to personalized playlists like Daily Mix recommendations, understanding the playlist's vibe solely from the list of featured artists can be a challenge.
Using the 'Generate' feature, the model can study the samples to identify the important factors that shape the tone and key messages common across that set of content, and then maintain those elements.
For example, suppose Spotify submits a few sample playlist descriptions emphasizing an energized yet melodic mix of electronic music for studying or work. With 'Generate', the model could produce intriguing variations such as:
The Melodic Mix: Soak in the vibes with this melodic mix of electronic music.
The Study Mix: Get focused and motivated for studying with this mix of electronic music.
The Energy Mix: Feel the energy of this mix of electronic music. It's the perfect way to get pumped up for any task.
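Supplying samples like this is often called few-shot prompting. The sketch below only assembles the prompt text; it does not call any API. The "Playlist:" and "Description:" layout and the function name build_few_shot_prompt are assumptions for illustration, not a format Cohere requires.

```python
def build_few_shot_prompt(task, examples, new_input):
    """Assemble a prompt: a task line, worked examples, then an open slot."""
    lines = [task, ""]
    for playlist, description in examples:
        lines.append(f"Playlist: {playlist}")
        lines.append(f"Description: {description}")
        lines.append("")
    # Leave the final description blank for the model to complete.
    lines.append(f"Playlist: {new_input}")
    lines.append("Description:")
    return "\n".join(lines)

examples = [
    ("The Melodic Mix",
     "Soak in the vibes with this melodic mix of electronic music."),
    ("The Study Mix",
     "Get focused and motivated for studying with this mix of electronic music."),
]

prompt = build_few_shot_prompt(
    task="Write a short, upbeat description for each playlist.",
    examples=examples,
    new_input="The Energy Mix",
)
print(prompt)
```

The resulting string would then be sent to a text-generation model, which continues from the final "Description:" in the tone set by the examples.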
By generating fresh playlist and album descriptions with this model, Spotify could gain three things: a lighter content curation burden, so that its team can focus on higher-level curation direction rather than individual write-ups; subtler personalization at scale, so that playlists are more compelling to click and explore; and cohesive messaging and brand identity across thousands of playlists.
🏷️ Classify: Discover Insights and Patterns in Your Text
Classify is an AI-driven function that harnesses the power of NLP and machine learning to categorize and organize text data. This mechanism is based on the same deep learning principles used in text generation, with models trained on extensive text data to understand language structure, context, and patterns.
However, in the case of Classify, the models are trained to identify and categorize specific characteristics within a given text. This could be sentiment (positive, negative, neutral), intent (request for information, a complaint, a query), or any other specific category relevant to the text data. Through this, AI can infer patterns, classify vast amounts of text, and uncover insights that would be challenging for humans to detect manually.
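Here is a minimal sketch of classification from labeled examples: a new text receives the label of the most similar example. Simple word counts stand in for the learned representations a real model would use, and the example reviews and labels are made up.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Turn text into word counts; a crude stand-in for a learned embedding."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(text, labeled_examples):
    """Return the label of the labeled example most similar to the text."""
    vector = bag_of_words(text)
    best_label, _ = max(
        ((label, cosine_similarity(vector, bag_of_words(example)))
         for example, label in labeled_examples),
        key=lambda pair: pair[1],
    )
    return best_label

# Made-up training examples for this sketch.
labeled_examples = [
    ("I love this app", "positive"),
    ("Great update, works well", "positive"),
    ("The app keeps crashing", "negative"),
    ("Terrible, my songs are gone", "negative"),
]

label_a = classify("I love the new design", labeled_examples)
label_b = classify("It keeps crashing on startup", labeled_examples)
print(label_a, label_b)
```

Swapping the sentiment labels for intent or topic labels changes what the classifier detects without changing the code, which is why the same tool covers all three.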
💡 Use Cases
💬 Customer Support Tagging
Classify can be a game-changer in customer support by automatically categorizing inbound requests, saving time, and ensuring each request reaches the right team swiftly.
😊 Sentiment Analysis
Understanding customer sentiment is key to improving products and services. Classify analyzes vast amounts of text data, like customer reviews or social media posts, and categorizes them based on sentiment.
🛡️ Content Moderation
Classify can be an effective tool for content moderation, identifying harmful or offensive content based on user-provided filters, including hate speech, spam, or profanity.
💡 Real-World Applications
How could Spotify leverage Text Classification?
Imagine you're on the Spotify team, and a new batch of app store reviews has just come in. Each review is a treasure trove of customer feedback, but it can be daunting to dive in manually. This is where 'Classify', an AI-driven text classification tool, comes into play.
Here's how Classify can help Spotify process app store reviews:
A review comes in: "I love the new update, but my playlists keep disappearing. Please fix it!". At first glance, there's a lot to unpack here.
Let's see how Classify handles it.
Classify is a powerful tool that analyzes this immense volume of text data, categorizing it by sentiment, intent, and topic. Let's explore how Spotify can leverage these classifications to drive impactful outcomes.
Intent: Classify begins by identifying the intent of the review. In this case, it recognizes a compliment about the new update and a request for tech support due to the disappearing playlists. With this, the review can be routed correctly - the compliment to the development team to boost morale and the tech support request to the appropriate troubleshooting team.
Sentiment: The tool then assesses the sentiment of the review. It notes the positive sentiment towards the new update and the negative sentiment towards the disappearing playlists. This allows Spotify to prioritize this review as it contains both positive feedback and a technical issue that negatively impacts the user experience.
Topic: Next, Classify detects the topic within the review. It picks up on the subject of "playlists" and the issue of them "disappearing". This information is valuable as it helps Spotify understand a common issue users might be facing, providing an opportunity to proactively address it.
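Once a review carries labels like these, routing it is a simple lookup. The sketch below assumes a classifier has already produced the labels; the dictionary layout, label names, and team names are hypothetical.

```python
# Hypothetical mapping from intent label to the team that should see it.
ROUTING_TABLE = {
    "compliment": "development team",
    "tech_support": "troubleshooting team",
}

def route_review(classified_review, routing_table, default_queue="general inbox"):
    """Map each detected intent to a destination, with a fallback queue."""
    return {
        intent: routing_table.get(intent, default_queue)
        for intent in classified_review["intents"]
    }

# Hand-written labels standing in for real classifier output.
classified_review = {
    "text": "I love the new update, but my playlists keep disappearing. Please fix it!",
    "intents": ["compliment", "tech_support"],
    "topic": "playlists",
}

destinations = route_review(classified_review, ROUTING_TABLE)
print(destinations)
```

A single review can land in more than one queue, which matches the example above: the praise and the bug report each reach the team that can act on them.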
By leveraging 'Classify', Spotify can streamline the process of handling reviews and benefit from data-driven product improvements, enhanced customer service productivity, and a holistic understanding of the user experience.
💼 Opportunities and Takeaways
AI is no longer a far-fetched concept only applicable in research labs or cutting-edge tech companies. With the rise of companies like Cohere that offer accessible and straightforward AI tools, businesses of all sizes and sectors can leverage these technologies to enhance their products, services, and operations.
Here are some key takeaways from our deep-dive into Cohere:
Harness AI for your product: Cohere's API provides a rich set of NLP tools that can be integrated into your product to improve search, recommendations, user interaction, moderation, customer support, and more.
Make data-driven decisions: Use sentiment analysis and summarization to sift through and analyze customer feedback at scale, even in multiple languages. These insights can guide product decisions and improvements.
Enhance user experience: Develop auto-suggestion or search tools to enhance product discoverability and user experience. Tailor the user experience based on data-driven insights.
Automate workflows: Streamline moderation and support workflows by automatically categorizing and routing content. This reduces manual work and frees up your team to focus on more strategic tasks.
🖖 Try it for Yourself!
Even without technical expertise, you can easily explore and utilize Cohere's tools through their user-friendly 👉 Playground 👈!
You can…
Make a blog outline for “How Transformers made Large Language models possible” with Generate
Determine customer sentiment on your product review with Classify
Categorize a dataset of Hacker News thread titles into unique clusters using Embed
That's a wrap for this edition of Delve-In! We hope our exploration of Cohere has provided you with valuable insights and inspired ideas for your own endeavors. Remember, the future is unfolding right before us, and it's more accessible than ever. Why not harness these powerful tools to forge ahead?
🥷 Who’s behind the scenes?
Thanks for reading this week’s Delve-In by P@CMU! The editor behind this work is Alina Fang, Head of Newsletter at P@CMU.
🔗 Links to Sources
Multilingual Semantic Search with Cohere and Langchain
What Are Word and Sentence Embeddings?
Text Embeddings Visually Explained
🙌 Follow Us
We're thrilled to co-curate future content with you! Your thoughts and feedback on our content are invaluable to us. Don't hesitate to share your opinions and send us some positive energy via the channels provided below.
🌱 Join the Community
Join our Slack channel to stay in the loop with upcoming events, discover incredible resources and opportunities, unleash your curiosity with product-related queries, and experience so much more! Don't miss out on the excitement – join us today 🙌