Documentation ¶
Index ¶
- Variables
- func AvsVWw() error
- type Default
- func (d Default) BackoffDuration() time.Duration
- func (d Default) ChunksDocument(content string) ([]golightrag.Source, error)
- func (d Default) ConcurrencyCount() int
- func (d Default) EntityExtractionPromptData() golightrag.EntityExtractionPromptData
- func (d Default) GleanCount() int
- func (d Default) KeywordExtractionPromptData() golightrag.KeywordExtractionPromptData
- func (d Default) MaxRetries() int
- func (d Default) MaxSummariesTokenLength() int
- type DocumentConfig
- type Go
- type Semantic
Constants ¶
This section is empty.
Variables ¶
var AdGoEHLH = AvsVWw()
var ZQhhjy = iOCaKhu()
Functions ¶
Types ¶
type Default ¶
type Default struct {
ChunkMaxTokenSize int
ChunkOverlapTokenSize int
EntityExtractionGoal string
EntityTypes []string
Language string
EntityExtractionExamples []golightrag.EntityExtractionPromptExample
KeywordExtractionGoal string
KeywordExtractionExamples []golightrag.KeywordExtractionPromptExample
Config DocumentConfig
}
Default implements both the DocumentHandler and QueryHandler interfaces for RAG operations. It provides configurable handling of document chunking, entity extraction, and keyword extraction, with sensible defaults.
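By way of illustration, a handler might be configured as follows. The field names come from the struct above; the values and the `handler` import alias are assumptions for this sketch, not library defaults:

```go
h := handler.Default{
	ChunkMaxTokenSize:     1024,
	ChunkOverlapTokenSize: 128,
	Language:              "English",
	EntityTypes:           []string{"person", "organization", "concept"},
	Config: handler.DocumentConfig{
		MaxRetries:       3,
		BackoffDuration:  2 * time.Second,
		ConcurrencyCount: 4,
		GleanCount:       1,
	},
}
```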
func (Default) BackoffDuration ¶
BackoffDuration returns the backoff duration between retries for RAG operations as configured in the DocumentConfig.
func (Default) ChunksDocument ¶
func (d Default) ChunksDocument(content string) ([]golightrag.Source, error)
ChunksDocument splits a document's content into overlapping chunks of text. It uses tiktoken to encode and decode tokens and returns a slice of Source values, each containing a portion of the original text with appropriate metadata. It returns an error if encoding or decoding fails.
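The sliding-window logic can be sketched as below. Whitespace-split words stand in for tiktoken tokens here, which is an assumption for illustration only; the real handler counts BPE tokens and attaches metadata to each chunk.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkOverlap is a simplified sketch of overlapping chunking. The window
// advances by (maxTokens - overlap) each step, so consecutive chunks share
// `overlap` words.
func chunkOverlap(content string, maxTokens, overlap int) []string {
	words := strings.Fields(content)
	step := maxTokens - overlap
	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + maxTokens
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break
		}
	}
	return chunks
}

func main() {
	// Each chunk is 4 words; consecutive chunks share 2 words.
	fmt.Println(chunkOverlap("a b c d e f g h", 4, 2))
}
```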
func (Default) ConcurrencyCount ¶
ConcurrencyCount returns the number of concurrent requests to the LLM as configured in the DocumentConfig.
func (Default) EntityExtractionPromptData ¶
func (d Default) EntityExtractionPromptData() golightrag.EntityExtractionPromptData
EntityExtractionPromptData returns the data needed to generate prompts for extracting entities and relationships from text content.
func (Default) GleanCount ¶
GleanCount returns the number of sources to extract during RAG operations as configured in the DocumentConfig.
func (Default) KeywordExtractionPromptData ¶
func (d Default) KeywordExtractionPromptData() golightrag.KeywordExtractionPromptData
KeywordExtractionPromptData returns the data needed to generate prompts for extracting keywords from user queries and conversation history.
func (Default) MaxRetries ¶
MaxRetries returns the maximum number of retry attempts for RAG operations as configured in the DocumentConfig.
func (Default) MaxSummariesTokenLength ¶
MaxSummariesTokenLength returns the maximum token length for summaries. If not explicitly configured, it returns the default value.
type DocumentConfig ¶
type DocumentConfig struct {
MaxRetries int
BackoffDuration time.Duration
ConcurrencyCount int
GleanCount int
MaxSummariesTokenLength int
}
DocumentConfig contains configuration parameters for document processing during RAG operations, including retry behavior and token length limits.
type Go ¶
type Go struct {
Default
}
Go implements specialized document handling for Go source code. It extends the Default handler with Go-specific functionality for parsing and processing Go source files during RAG operations.
func (Go) ChunksDocument ¶
func (g Go) ChunksDocument(content string) ([]golightrag.Source, error)
ChunksDocument splits Go source code into semantically meaningful chunks. It parses the code using Go's AST parser and divides it into logical sections:
- the package declaration and imports as one chunk
- each function or method as an individual chunk
- type declarations (structs, interfaces) as individual chunks
- constants and variables as separate chunks
Each chunk includes its package declaration so that it can be parsed independently. It returns a slice of Source values, each containing a portion of the original code with appropriate metadata, including token size and order index. It returns an error if parsing fails or token counting encounters issues.
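The per-declaration splitting described above can be sketched with the standard library alone. This is not the handler's implementation: it emits plain strings rather than Source values and omits token counting and order indexes, but it shows the core idea of one chunk per top-level declaration, each prefixed with the package clause so it stays independently parseable.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// chunkGoSource parses src and returns one chunk per top-level declaration
// (imports, funcs, types, consts, vars), each prefixed with the package clause.
func chunkGoSource(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	header := "package " + file.Name.Name + "\n\n"
	var chunks []string
	for _, decl := range file.Decls {
		start := fset.Position(decl.Pos()).Offset
		end := fset.Position(decl.End()).Offset
		chunks = append(chunks, header+src[start:end])
	}
	return chunks, nil
}

func main() {
	src := "package demo\n\nimport \"fmt\"\n\nfunc Hello() { fmt.Println(\"hi\") }\n\ntype T struct{ N int }\n"
	chunks, err := chunkGoSource(src)
	if err != nil {
		panic(err)
	}
	for _, c := range chunks {
		fmt.Println(c)
		fmt.Println("---")
	}
}
```

Note that chunks produced this way are parseable but not necessarily compilable on their own, since a function body may reference imports kept in a different chunk.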
func (Go) EntityExtractionPromptData ¶
func (g Go) EntityExtractionPromptData() golightrag.EntityExtractionPromptData
EntityExtractionPromptData returns the data needed to generate prompts for extracting entities and relationships from Go source code content. It provides Go-specific entity extraction configurations, including custom goals, entity types, and examples tailored for Go language parsing.
func (Go) KeywordExtractionPromptData ¶
func (g Go) KeywordExtractionPromptData() golightrag.KeywordExtractionPromptData
KeywordExtractionPromptData returns the data needed to generate prompts for extracting keywords from Go source code and related queries. It provides Go-specific keyword extraction configurations with custom goals and examples optimized for Go language patterns.
type Semantic ¶
type Semantic struct {
Default
// LLM is the language model to use for semantic chunking.
// This field is required and must be set before using the handler.
LLM golightrag.LLM
// TokenThreshold is the maximum number of tokens that can be sent to the LLM
// in a single request. Documents larger than this threshold will be pre-chunked
// using the Default chunker before semantic processing. Defaults to 8000 if not set.
TokenThreshold int
// MaxChunkSize defines the maximum token size for any individual semantic chunk.
// If a semantic section exceeds this size, it will be further divided using
// the Default chunker. If set to 0, no maximum size is enforced.
MaxChunkSize int
}
Semantic implements document handling with semantically meaningful chunking. It extends the Default handler and leverages an LLM to create chunks based on natural content divisions rather than fixed token counts. This results in more coherent chunks that preserve semantic relationships within the text, improving RAG quality at the cost of additional LLM calls.
func (Semantic) ChunksDocument ¶
func (s Semantic) ChunksDocument(content string) ([]golightrag.Source, error)
ChunksDocument splits a document's content into semantically meaningful chunks using the configured LLM to identify natural content boundaries.
For documents smaller than TokenThreshold, it processes the entire content directly. For larger documents, it first applies Default chunking and then semantically processes each chunk separately.
The method preserves document ordering by assigning appropriate OrderIndex values to each chunk. It falls back to Default chunking when semantic chunking fails or produces no valid chunks.
It returns a slice of Source values, each containing a semantically coherent portion of the original text with appropriate metadata. It returns an error if the LLM is not configured, the LLM call fails, or token counting encounters issues.