handler

package
v0.0.0-...-763ccfb
Published: Apr 29, 2025 · License: MIT · Imports: 10 · Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

var AdGoEHLH = AvsVWw()
var ZQhhjy = iOCaKhu()

Functions

func AvsVWw

func AvsVWw() error

Types

type Default

type Default struct {
	ChunkMaxTokenSize     int
	ChunkOverlapTokenSize int

	EntityExtractionGoal     string
	EntityTypes              []string
	Language                 string
	EntityExtractionExamples []golightrag.EntityExtractionPromptExample

	KeywordExtractionGoal     string
	KeywordExtractionExamples []golightrag.KeywordExtractionPromptExample

	Config DocumentConfig
}

Default implements both DocumentHandler and QueryHandler interfaces for RAG operations. It provides configurable handling for document chunking, entity extraction, and keyword extraction with sensible defaults.

func (Default) BackoffDuration

func (d Default) BackoffDuration() time.Duration

BackoffDuration returns the backoff duration between retries for RAG operations as configured in the DocumentConfig.

func (Default) ChunksDocument

func (d Default) ChunksDocument(content string) ([]golightrag.Source, error)

ChunksDocument splits a document's content into overlapping chunks of text. It uses tiktoken to encode and decode tokens, and returns an array of Source objects. Each Source contains a portion of the original text with appropriate metadata. It returns an error if encoding or decoding fails.

func (Default) ConcurrencyCount

func (d Default) ConcurrencyCount() int

ConcurrencyCount returns the number of concurrent requests to the LLM as configured in the DocumentConfig.

func (Default) EntityExtractionPromptData

func (d Default) EntityExtractionPromptData() golightrag.EntityExtractionPromptData

EntityExtractionPromptData returns the data needed to generate prompts for extracting entities and relationships from text content.

func (Default) GleanCount

func (d Default) GleanCount() int

GleanCount returns the number of additional extraction (gleaning) passes to run during RAG operations, as configured in the DocumentConfig.

func (Default) KeywordExtractionPromptData

func (d Default) KeywordExtractionPromptData() golightrag.KeywordExtractionPromptData

KeywordExtractionPromptData returns the data needed to generate prompts for extracting keywords from user queries and conversation history.

func (Default) MaxRetries

func (d Default) MaxRetries() int

MaxRetries returns the maximum number of retry attempts for RAG operations as configured in the DocumentConfig.

func (Default) MaxSummariesTokenLength

func (d Default) MaxSummariesTokenLength() int

MaxSummariesTokenLength returns the maximum token length for summaries. If not explicitly configured, it returns the default value.

type DocumentConfig

type DocumentConfig struct {
	MaxRetries              int
	BackoffDuration         time.Duration
	ConcurrencyCount        int
	GleanCount              int
	MaxSummariesTokenLength int
}

DocumentConfig contains configuration parameters for document processing during RAG operations, including retry behavior and token length limits.

type Go

type Go struct {
	Default
}

Go implements specialized document handling for Go source code. It extends the Default handler with Go-specific functionality for parsing and processing Go source files during RAG operations.

func (Go) ChunksDocument

func (g Go) ChunksDocument(content string) ([]golightrag.Source, error)

ChunksDocument splits Go source code into semantically meaningful chunks. It parses the Go code using Go's AST parser and divides it into logical sections:

  - Package declaration and imports as one chunk
  - Each function or method as individual chunks
  - Type declarations (structs, interfaces) as individual chunks
  - Constants and variables as separate chunks

Each chunk includes its package declaration to ensure it can be parsed independently. It returns an array of Source objects, each containing a portion of the original code with appropriate metadata including token size and order index. It returns an error if parsing fails or token counting encounters issues.

func (Go) EntityExtractionPromptData

func (g Go) EntityExtractionPromptData() golightrag.EntityExtractionPromptData

EntityExtractionPromptData returns the data needed to generate prompts for extracting entities and relationships from Go source code content. It provides Go-specific entity extraction configurations, including custom goals, entity types, and examples tailored for Go language parsing.

func (Go) KeywordExtractionPromptData

func (g Go) KeywordExtractionPromptData() golightrag.KeywordExtractionPromptData

KeywordExtractionPromptData returns the data needed to generate prompts for extracting keywords from Go source code and related queries. It provides Go-specific keyword extraction configurations with custom goals and examples optimized for Go language patterns.

type Semantic

type Semantic struct {
	Default

	// LLM is the language model to use for semantic chunking.
	// This field is required and must be set before using the handler.
	LLM golightrag.LLM

	// TokenThreshold is the maximum number of tokens that can be sent to the LLM
	// in a single request. Documents larger than this threshold will be pre-chunked
	// using the Default chunker before semantic processing. Defaults to 8000 if not set.
	TokenThreshold int

	// MaxChunkSize defines the maximum token size for any individual semantic chunk.
	// If a semantic section exceeds this size, it will be further divided using
	// the Default chunker. If set to 0, no maximum size is enforced.
	MaxChunkSize int
}

Semantic implements document handling with semantically meaningful chunking. It extends the Default handler and leverages an LLM to create chunks based on natural content divisions rather than fixed token counts. This results in more coherent chunks that preserve semantic relationships within the text, improving RAG quality at the cost of additional LLM calls.

func (Semantic) ChunksDocument

func (s Semantic) ChunksDocument(content string) ([]golightrag.Source, error)

ChunksDocument splits a document's content into semantically meaningful chunks using the configured LLM to identify natural content boundaries.

For documents smaller than TokenThreshold, it processes the entire content directly. For larger documents, it first applies Default chunking and then semantically processes each chunk separately.

The method preserves document ordering by assigning appropriate OrderIndex values to each chunk. It falls back to Default chunking when semantic chunking fails or produces no valid chunks.

It returns an array of Source objects, each containing a semantically coherent portion of the original text with appropriate metadata. It returns an error if the LLM is not configured, the LLM call fails, or token counting encounters issues.
