# Langchain Overview
Langchain is a framework for building applications powered by language models. It provides tools, components, and interfaces to connect LLMs (like GPT-4 or Claude) with external data sources, APIs, and complex workflows.
## Core Concept
Traditional LLM applications have limitations:
- Fixed knowledge cutoff - Models only know what they were trained on
- No external data - Can't access your databases or files
- Single-turn responses - No memory or context between requests
- No tool usage - Can't perform actions like web searches or calculations
Langchain solves this by providing a framework to:
- Connect LLMs to external data sources (vector databases, APIs, files)
- Chain multiple operations together (search → retrieve → synthesize)
- Give models memory across conversations
- Enable tool usage (calculators, web search, custom functions)
## How Langchain Works

### Core Components
**1. Models**: The LLM itself (GPT-4, Claude, Llama, etc.)

```javascript
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ temperature: 0.7 });
```
**2. Prompts**: Templates for structuring inputs to the model

```javascript
import { PromptTemplate } from "@langchain/core/prompts";

const prompt = PromptTemplate.fromTemplate(
  "Explain {concept} in the context of {domain}"
);
```
**3. Chains**: Sequences of operations that connect components

```javascript
const chain = prompt.pipe(model);
const result = await chain.invoke({
  concept: "embeddings",
  domain: "music analysis",
});
```
**4. Memory**: Stores conversation history and context

```javascript
import { BufferMemory } from "langchain/memory";

const memory = new BufferMemory();
await memory.saveContext({ input: "What is Convex?" }, { output: "..." });
```
**5. Retrievers**: Fetch relevant data from external sources

```javascript
const retriever = vectorStore.asRetriever();
const docs = await retriever.getRelevantDocuments("song analysis patterns");
```
**6. Agents**: LLMs that can decide which tools to use and when

```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [searchTool, calculatorTool, databaseTool],
  model,
  { agentType: "openai-functions" }
);
```
## When to Use Langchain

**✅ Use Langchain when you need:**
- RAG (Retrieval Augmented Generation) - Query your own data with LLMs
- Complex workflows - Multi-step processes with branching logic
- Memory and context - Conversations that remember previous exchanges
- Tool integration - LLMs that can call APIs, search databases, run code
- Rapid prototyping - Pre-built components for common patterns
**❌ Don't use Langchain when:**
- Simple single-prompt applications (just use the LLM API directly)
- No external data needed
- Performance is critical (Langchain adds overhead)
- You want full control over every detail (framework abstracts complexity)
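For the simple single-prompt case, a bare HTTP call to the model API is all you need. A minimal sketch, assuming the OpenAI chat completions endpoint; `buildChatRequest` is a hypothetical helper, not part of any library:

```javascript
// Build the request body for a single-prompt call, no framework involved.
// buildChatRequest is a made-up helper name for this sketch.
function buildChatRequest(userPrompt) {
  return {
    model: "gpt-4o",
    messages: [{ role: "user", content: userPrompt }],
  };
}

// Usage sketch (requires an OPENAI_API_KEY in the environment):
// const res = await fetch("https://api.openai.com/v1/chat/completions", {
//   method: "POST",
//   headers: {
//     "Content-Type": "application/json",
//     Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
//   },
//   body: JSON.stringify(buildChatRequest("Explain embeddings")),
// });
// const data = await res.json();
// console.log(data.choices[0].message.content);
```

One fetch call, one response: when that is the whole application, a framework only adds a dependency.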
## Real-World Examples

### Example 1: Music Analysis with Your Library
**Problem:** "Find all songs in my library where the bass line follows a descending chromatic pattern"
**Without Langchain:**
- Manually write code to search database
- Write code to format results
- Write code to ask LLM to analyze each song
- Write code to aggregate results
**With Langchain:**

```javascript
import { createRetrievalChain } from "langchain/chains/retrieval";

// stuffDocumentsChain is built separately (e.g. with createStuffDocumentsChain)
const chain = await createRetrievalChain({
  retriever: songDatabase.asRetriever(),
  combineDocsChain: stuffDocumentsChain,
});
const result = await chain.invoke({
  input: "Find songs with descending chromatic bass lines",
});
```
### Example 2: Research Assistant for PhD

**Problem:** Search across song annotations, literature notes, and NNT scores

**Langchain solution:**

```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [
    convexRetriever, // query Convex database
    vaultSearchTool, // search Obsidian vault
    nntParserTool, // parse NNT notation
  ],
  model,
  { agentType: "openai-functions" }
);

await agent.invoke({
  input: "What do my annotations say about tritone substitutions in bebop?",
});
```
The agent decides which tools to use, combines their results, and synthesizes an answer.
## Langchain.js vs Python Langchain

**Langchain.js (JavaScript/TypeScript):**
- Better for web applications
- Integrates with React, Node.js, Convex
- Native browser support
- Smaller ecosystem (newer)
**Python Langchain:**
- More mature, larger community
- Better for data science workflows
- More integrations available
- Preferred for research prototypes
**For your NNT ecosystem:** Use Langchain.js, since it integrates with React components, Convex, and web deployment.
## Key Concepts Explained

### Chains vs Agents

**Chains:**
- Predefined sequence of steps
- Always executes in same order
- Fast and predictable
- Use for: Known workflows
**Agents:**
- LLM decides what to do
- Dynamic tool selection
- Slower but more flexible
- Use for: Complex, open-ended tasks
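The contrast can be sketched in plain JavaScript, with ordinary functions standing in for LLM calls and tools (all names here are hypothetical):

```javascript
// A chain: a fixed pipeline, same steps in the same order every time.
const toUpper = (s) => s.toUpperCase();
const addPrefix = (s) => `RESULT: ${s}`;
const chain = (input) => addPrefix(toUpper(input));

// An "agent": inspects the input and decides which tool to run.
// A real agent asks the LLM to choose; a simple rule stands in for
// that decision here.
const tools = {
  reverse: (s) => [...s].reverse().join(""),
  shout: (s) => s.toUpperCase(),
};

function agent(input) {
  const tool = input.startsWith("reverse:") ? "reverse" : "shout";
  return tools[tool](input.replace(/^reverse:/, ""));
}
```

The chain is cheap and predictable; the agent pays an extra decision step on every call in exchange for flexibility.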
### Embeddings and Vector Stores

**Embeddings:** Numerical representations of text that capture semantic meaning

```javascript
const embedding = await embeddings.embedQuery("jazz improvisation");
// → [0.23, -0.15, 0.87, ...] (vector of 1536 numbers)
```
**Vector Store:** A database optimized for similarity search

```javascript
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const vectorStore = await MemoryVectorStore.fromDocuments(
  songAnnotations,
  embeddings
);

// Find similar content semantically (top 5 matches)
const similar = await vectorStore.similaritySearch("chord substitutions", 5);
```
### RAG (Retrieval Augmented Generation)

**Pattern:**

1. User asks a question
2. Embed the question into a vector
3. Search the vector store for similar content
4. Pass the retrieved content + question to the LLM
5. The LLM generates an answer using the retrieved context
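The embed-and-search steps above can be sketched with toy vectors. Real systems use learned embeddings with hundreds of dimensions; the 3-number vectors and document texts below are made up for illustration:

```javascript
// Cosine similarity: 1.0 for identical directions, 0 for unrelated ones.
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// A tiny in-memory "vector store" with fake embeddings.
const store = [
  { text: "Tritone substitution in bebop", vector: [0.9, 0.1, 0.0] },
  { text: "Modal interchange examples", vector: [0.1, 0.9, 0.0] },
  { text: "Drum rudiments overview", vector: [0.0, 0.1, 0.9] },
];

// Rank documents by similarity to the query vector, return the top k texts.
function retrieve(queryVector, k) {
  return store
    .map((doc) => ({ ...doc, score: cosine(queryVector, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((doc) => doc.text);
}

// A query "about substitutions" embeds near the first document:
const context = retrieve([0.8, 0.2, 0.0], 1);
// The next step would pass `context` plus the question to the LLM.
```

This is the whole retrieval half of RAG; everything a framework adds on top is plumbing around these two operations.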
**Why it works:**
- Overcomes LLM knowledge cutoff
- Grounds answers in your data
- Reduces hallucination
- Works with private/specialized content
## Common Patterns

### Pattern 1: Simple RAG

```javascript
import { RetrievalQAChain } from "langchain/chains";

const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever());
```
### Pattern 2: Conversational RAG

```javascript
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";

const chain = ConversationalRetrievalQAChain.fromLLM(
  model,
  vectorStore.asRetriever(),
  { memory: new BufferMemory({ memoryKey: "chat_history" }) }
);
```
### Pattern 3: Agent with Tools

```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [searchTool, calculatorTool],
  model,
  { agentType: "openai-functions" }
);
```
## Integration with Your Stack

**Langchain.js + Convex:**

```javascript
import { ConvexVectorStore } from "@langchain/community/vectorstores/convex";

// Inside a Convex action: store embeddings in a Convex table
const convexVectorStore = new ConvexVectorStore(embeddings, {
  ctx,
  table: "song_embeddings",
  index: "by_embedding",
});

// Query with semantic search
const results = await convexVectorStore.similaritySearch(
  "modal interchange examples"
);
```
**Langchain.js + React Components:**

```javascript
import { useState } from "react";
import { createRetrievalChain } from "langchain/chains/retrieval";

function AnnotationSearch() {
  const [result, setResult] = useState(null);

  const handleSearch = async (query) => {
    const chain = await createRetrievalChain({ ... });
    const answer = await chain.invoke({ input: query });
    setResult(answer);
  };

  return <SearchInterface onSearch={handleSearch} result={result} />;
}
```
## Getting Started

**Installation:**

```bash
npm install langchain @langchain/openai
```
**Minimal example:**

```javascript
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const model = new ChatOpenAI();
const prompt = PromptTemplate.fromTemplate("Explain {topic}");
const chain = prompt.pipe(model);

const result = await chain.invoke({ topic: "embeddings" });
console.log(result.content);
```
## See Also
- Vectorized Databases - Deep dive into embeddings and vector search
- Langchain with Convex - Integration patterns for your stack
- Langchain Use Cases for NNT - Specific applications to your research
- ripgrep vs Vector Search - When to use each approach