# LangChain Overview
LangChain is a framework for building applications powered by language models. It provides tools, components, and interfaces to connect LLMs (such as GPT-4 or Claude) to external data sources, APIs, and complex workflows.
## Core Concept
Traditional LLM applications have limitations:
- **Fixed knowledge cutoff** - Models only know what they were trained on
- **No external data** - Can't access your databases or files
- **Single-turn responses** - No memory or context between requests
- **No tool usage** - Can't perform actions like web searches or calculations
**LangChain addresses these limitations** by providing a framework to:
1. Connect LLMs to external data sources (vector databases, APIs, files)
2. Chain multiple operations together (search → retrieve → synthesize)
3. Give models memory across conversations
4. Enable tool usage (calculators, web search, custom functions)
## How LangChain Works
### Core Components
**1. Models**
The LLM itself (GPT-4, Claude, Llama, etc.)
```javascript
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ temperature: 0.7 });
```
**2. Prompts**
Templates for structuring inputs to the model
```javascript
import { PromptTemplate } from "@langchain/core/prompts";

const prompt = PromptTemplate.fromTemplate(
  "Explain {concept} in the context of {domain}"
);
```
**3. Chains**
Sequences of operations that connect components
```javascript
const chain = prompt.pipe(model);
const result = await chain.invoke({
  concept: "embeddings",
  domain: "music analysis"
});
```
**4. Memory**
Stores conversation history and context
```javascript
import { BufferMemory } from "langchain/memory";

const memory = new BufferMemory();
await memory.saveContext({ input: "What is Convex?" }, { output: "..." });
```
**5. Retrievers**
Fetch relevant data from external sources
```javascript
const retriever = vectorStore.asRetriever();
const docs = await retriever.getRelevantDocuments("song analysis patterns");
```
**6. Agents**
LLMs that can decide which tools to use and when
```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [searchTool, calculatorTool, databaseTool],
  model,
  { agentType: "openai-functions" }
);
```
## When to Use LangChain
**✅ Use LangChain when you need:**
- **RAG (Retrieval-Augmented Generation)** - Query your own data with LLMs
- **Complex workflows** - Multi-step processes with branching logic
- **Memory and context** - Conversations that remember previous exchanges
- **Tool integration** - LLMs that can call APIs, search databases, run code
- **Rapid prototyping** - Pre-built components for common patterns
**❌ Skip LangChain when:**
- The application is a single prompt (call the LLM API directly)
- No external data is needed
- Performance is critical (LangChain adds abstraction overhead)
- You want full control over every detail (the framework abstracts it away)
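For the single-prompt case, calling the provider's HTTP API directly is enough. A minimal sketch (plain JavaScript, no framework) that builds the request body for OpenAI's chat-completions endpoint; the model name is just an example, and the actual network call is left commented out:

```javascript
// Build the JSON body for a single chat-completion request.
// "gpt-4o-mini" is an example model name, not a recommendation.
function buildChatRequest(userPrompt) {
  return {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: userPrompt }]
  };
}

// fetch("https://api.openai.com/v1/chat/completions", {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
//     "Content-Type": "application/json"
//   },
//   body: JSON.stringify(buildChatRequest("Explain embeddings"))
// });
```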
## Real-World Examples
### Example 1: Music Analysis with Your Library
**Problem:** "Find all songs in my library where the bass line follows a descending chromatic pattern"
**Without LangChain:**
1. Manually write code to search database
2. Write code to format results
3. Write code to ask LLM to analyze each song
4. Write code to aggregate results
**With LangChain:**
```javascript
import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";

// combineDocsChain stuffs retrieved docs into a prompt containing {context}
const combineDocsChain = await createStuffDocumentsChain({ llm: model, prompt });
const chain = await createRetrievalChain({
  retriever: songDatabase.asRetriever(),
  combineDocsChain
});
const result = await chain.invoke({
  input: "Find songs with descending chromatic bass lines"
});
```
### Example 2: Research Assistant for PhD
**Problem:** Search across song annotations, literature notes, and NNT scores
**LangChain solution:**
```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [
    convexRetriever,  // Query Convex database
    vaultSearchTool,  // Search Obsidian vault
    nntParserTool     // Parse NNT notation
  ],
  model,
  { agentType: "openai-functions" }
);
await agent.invoke({
  input: "What do my annotations say about tritone substitutions in bebop?"
});
```
The agent decides which tools to use, combines their results, and synthesizes an answer.
## LangChain.js vs Python LangChain
**LangChain.js (JavaScript/TypeScript):**
- Better for web applications
- Integrates with React, Node.js, and Convex
- Runs natively in the browser
- Smaller (newer) ecosystem
**Python LangChain:**
- More mature, larger community
- Better for data-science workflows
- More integrations available
- Preferred for research prototypes
**For your NNT ecosystem:** use LangChain.js; it integrates with React components, Convex, and web deployment.
## Key Concepts Explained
### Chains vs Agents
**Chains:**
- Predefined sequence of steps
- Always executes in same order
- Fast and predictable
- Use for: Known workflows
**Agents:**
- LLM decides what to do
- Dynamic tool selection
- Slower but more flexible
- Use for: Complex, open-ended tasks
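The contrast can be sketched in plain JavaScript (no LangChain). Here `tools`, `pickTool`, and `loudPicker` are hypothetical names; in a real agent, `pickTool` would be an LLM call deciding which tool to run:

```javascript
// Two toy "tools"
const tools = {
  search: (q) => `search results for "${q}"`,
  shout: (q) => q.toUpperCase()
};

// Chain: a fixed pipeline, always the same steps in the same order
const runChain = (input) => tools.shout(tools.search(input));

// Agent loop: the model decides at runtime which tool to use
// (a trivial rule stands in here for the LLM's decision)
function runAgent(input, pickTool) {
  const choice = pickTool(input); // e.g. "search" or "shout"
  return tools[choice](input);
}

// A stand-in "decision model": shout if the input ends with "!"
const loudPicker = (input) => (input.endsWith("!") ? "shout" : "search");
```

`runChain` always searches then shouts; `runAgent` may do either, depending on what the picker decides, which is exactly the predictability/flexibility trade-off above.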
### Embeddings and Vector Stores
**Embeddings:** Numerical representations of text that capture semantic meaning
```javascript
const embedding = await embeddings.embedQuery("jazz improvisation");
// → [0.23, -0.15, 0.87, ...] (e.g. a vector of 1536 numbers, depending on the model)
```
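Similarity between two embeddings is usually measured with cosine similarity. A minimal plain-JavaScript sketch, independent of LangChain:

```javascript
// Cosine similarity: dot product divided by the product of vector lengths.
// ~1 means same direction (similar meaning), ~0 means unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A vector store is essentially this comparison run efficiently over many stored vectors at once.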
**Vector Store:** Database optimized for similarity search
```javascript
const vectorStore = await MemoryVectorStore.fromDocuments(
  songAnnotations,
  embeddings
);
// Find semantically similar content (top 5 matches)
const similar = await vectorStore.similaritySearch(
  "chord substitutions",
  5
);
```
### RAG (Retrieval Augmented Generation)
**Pattern:**
1. User asks question
2. Embed question into vector
3. Search vector store for similar content
4. Pass retrieved content + question to LLM
5. LLM generates answer using retrieved context
**Why it works:**
- Overcomes LLM knowledge cutoff
- Grounds answers in your data
- Reduces hallucination
- Works with private/specialized content
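The whole pattern fits in a few lines of plain JavaScript. In this toy sketch, keyword overlap stands in for embedding similarity (steps 2-3), and the `llm` parameter is a stub for the real model call (steps 4-5); `docs`, `retrieve`, and `answer` are all hypothetical names:

```javascript
// A tiny "knowledge base"
const docs = [
  "Tritone substitution replaces V7 with the dominant a tritone away.",
  "Modal interchange borrows chords from parallel modes.",
  "A ii-V-I is the most common jazz cadence."
];

// Steps 2-3: score each doc by shared words with the question, keep top k
// (a real system would compare embedding vectors instead)
function retrieve(question, k = 1) {
  const qWords = new Set(question.toLowerCase().split(/\W+/));
  return docs
    .map((d) => ({
      d,
      score: d.toLowerCase().split(/\W+/).filter((w) => qWords.has(w)).length
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.d);
}

// Steps 4-5: pass retrieved context plus the question to the model
function answer(question, llm) {
  const context = retrieve(question).join("\n");
  return llm(`Context:\n${context}\n\nQuestion: ${question}`);
}
```

Because the model only sees the retrieved context, its answer is grounded in your documents rather than in whatever it memorized during training.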
## Common Patterns
### Pattern 1: Simple RAG
```javascript
import { RetrievalQAChain } from "langchain/chains";

const chain = RetrievalQAChain.fromLLM(
  model,
  vectorStore.asRetriever()
);
```
### Pattern 2: Conversational RAG
```javascript
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";

const chain = ConversationalRetrievalQAChain.fromLLM(
  model,
  vectorStore.asRetriever(),
  { memory: new BufferMemory({ memoryKey: "chat_history" }) }
);
```
### Pattern 3: Agent with Tools
```javascript
import { initializeAgentExecutorWithOptions } from "langchain/agents";

const agent = await initializeAgentExecutorWithOptions(
  [searchTool, calculatorTool],
  model,
  { agentType: "openai-functions" }
);
```
## Integration with Your Stack
**LangChain.js + Convex:**
```javascript
import { ConvexVectorStore } from "@langchain/community/vectorstores/convex";

// Store embeddings in Convex (run inside a Convex action, where `ctx` exists)
const convexVectorStore = new ConvexVectorStore(embeddings, {
  ctx,
  table: "song_embeddings",
  index: "by_embedding"
});
// Query with semantic search
const results = await convexVectorStore.similaritySearch(
  "modal interchange examples"
);
```
**LangChain.js + React components:**
```javascript
import { useState } from "react";
import { createRetrievalChain } from "langchain/chains/retrieval";

function AnnotationSearch() {
  const [result, setResult] = useState();
  const handleSearch = async (query) => {
    const chain = await createRetrievalChain({ ... });
    const answer = await chain.invoke({ input: query });
    setResult(answer);
  };
  return <SearchInterface onSearch={handleSearch} result={result} />;
}
```
## Getting Started
**Installation:**
```bash
npm install langchain @langchain/openai
```
**Minimal example:**
```javascript
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const model = new ChatOpenAI();
const prompt = PromptTemplate.fromTemplate("Explain {topic}");
const chain = prompt.pipe(model);
const result = await chain.invoke({ topic: "embeddings" });
console.log(result.content);
```
## See Also
- [[Vectorized Databases]] - Deep dive into embeddings and vector search
- [[Langchain with Convex]] - Integration patterns for your stack
- [[Langchain Use Cases for NNT]] - Specific applications to your research
- [[ripgrep vs Vector Search]] - When to use each approach