rfcquery

package module
v0.4.0 Latest
Published: Dec 1, 2025 License: MIT Imports: 5 Imported by: 0

README

rfcquery


A strict RFC3986-compliant query string parser for Go with a pluggable architecture. Parse URL queries with precision, extensibility, and performance.

The Problem

Go's standard library url.ParseQuery() implements application/x-www-form-urlencoded, not the full RFC3986 query specification:

  • ❌ Rejects valid RFC3986 characters (:, /, ?, @) unless encoded
  • ❌ Treats + as space (HTML form-specific, not RFC3986)
  • ❌ Assumes key-value pairs only (RFC3986 allows any structure)
  • ❌ No token-level access for custom parsing logic
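
The `+` behaviour listed above can be reproduced with the standard library alone. The sketch below is illustrative only: `firstValue` is a hypothetical helper, not part of rfcquery, and it shows how `url.ParseQuery` decodes a literal `+`.

```go
package main

import (
	"fmt"
	"net/url"
)

// firstValue parses raw with the standard library and returns the first
// value stored under key, or "" if the key is absent or parsing fails.
// It is a hypothetical helper written for this example.
func firstValue(raw, key string) string {
	values, err := url.ParseQuery(raw)
	if err != nil {
		return ""
	}
	return values.Get(key)
}

func main() {
	// A literal '+' is decoded as a space under form-urlencoded rules.
	fmt.Printf("%q\n", firstValue("q=a+b", "q")) // "a b"
	// To keep a literal plus sign it has to be sent as %2B.
	fmt.Printf("%q\n", firstValue("q=a%2Bb", "q")) // "a+b"
}
```

Under RFC3986, `+` is an ordinary sub-delimiter, so a strict query parser would be expected to leave it untouched.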

The Solution

rfcquery separates lexical validation from semantic parsing:

  1. Lexer Layer: Strict RFC3986 validation with position-aware errors
  2. Token Stream: Fine-grained access to query characters with lookahead
  3. Plugin System: Pluggable parsers for different query formats

Quick Start

go get github.com/CRSylar/rfcquery

Examples

package main

import (
    "fmt"
    "log"
    "github.com/CRSylar/rfcquery"
)

func main() {
    query := "filter[name]=John%20Doe&sort=created@asc"
    
    // Validate RFC3986 compliance
    scanner := rfcquery.NewScanner(query)
    if err := scanner.Valid(); err != nil {
        log.Fatal(err) // rfcquery: invalid character ' ' at position 7
    }
    
    // Parse as form-urlencoded (RFC3986-compliant)
    values, err := rfcquery.ParseFormURLEncoded(query)
    if err != nil {
        log.Fatal(err)
    }
    
    // Get returns every value for a key; First returns only the first one.
    if v, ok := values.First("filter[name]"); ok {
        fmt.Printf("Filter: %s\n", v.Value)
        // Output: Filter: John Doe
    }
}

Features

RFC3986 Strict Validation
query := "user:pass@host/path?search"
scanner := rfcquery.NewScanner(url.QueryEscape(query))
err := scanner.Valid() // nil - valid RFC3986

The scanner validates percent-encoding and character classes, and reports the precise position of any error.
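
To make the validation rules concrete, here is a minimal standalone sketch of position-aware RFC3986 query validation. It is not rfcquery's implementation: `isQueryChar`, `isHex` and `validateQuery` are hypothetical names, and the character set is taken from the RFC3986 `query` grammar (unreserved, sub-delims, `:`, `@`, `/`, `?`, plus `%HH` sequences).

```go
package main

import (
	"fmt"
	"strings"
)

// isQueryChar reports whether c may appear literally in an RFC 3986 query:
// unreserved / sub-delims / ":" / "@" / "/" / "?".
func isQueryChar(c byte) bool {
	switch {
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		return true
	}
	return strings.IndexByte("-._~!$&'()*+,;=:@/?", c) >= 0
}

// isHex reports whether c is a hexadecimal digit.
func isHex(c byte) bool {
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
}

// validateQuery returns the byte offset of the first violation, or -1 if
// the whole string is a valid RFC 3986 query component.
func validateQuery(q string) int {
	for i := 0; i < len(q); i++ {
		switch {
		case q[i] == '%':
			// A percent sign must be followed by exactly two hex digits.
			if i+2 >= len(q) || !isHex(q[i+1]) || !isHex(q[i+2]) {
				return i
			}
			i += 2
		case !isQueryChar(q[i]):
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(validateQuery("user:pass@host/path?search")) // -1
	fmt.Println(validateQuery("name=John Doe"))              // 9
	fmt.Println(validateQuery("name=100%"))                  // 8
}
```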

Token Stream API
scanner := rfcquery.NewScanner("name=John%20Doe")

// Token-by-token access
for {
    tok, err := scanner.NextToken()
    if err != nil {
        log.Fatal(err)
    }
    if tok.Type == rfcquery.TokenEOF {
        break
    }
    fmt.Printf("%s: %q (decoded: %q)\n", tok.Type, tok.Value, tok.Decoded)
}
Bulk collection for performance:
// Collect until condition
tokens, err := scanner.CollectWhile(func(tok rfcquery.Token) bool {
    return tok.Type != rfcquery.TokenSubDelims || tok.Value != "&"
})

// Reconstruct strings
original := tokens.String()           // "name=John%20Doe"
decoded := tokens.StringDecoded()     // "name=John Doe"
Plugin Architecture:

Built-in parsers with a common interface:

  1. Form URL-Encoded (application/x-www-form-urlencoded)

    parser := &rfcquery.FormURLEncodedParser{
        PreserveInsertionOrder: true,
        AllowDuplicateKeys:     true,
    }
    
    scanner := rfcquery.NewScanner("tags=go&tags=library")
    values, err := parser.Parse(scanner)
    
    // Access all values for a key
    tags := values.Get("tags") // ["go", "library"]
    

    Advantages over net/url:

    • Preserves insertion order
    • Handles RFC3986 special characters (:, @, /, ?)
    • Token-level metadata (position, raw values)
  2. JSON-in-query: extract JSON from query parameter values:

    // NOTE: the characters { " , } must be percent-encoded to be valid in RFC3986.
    // They are shown unencoded here only to make the example easier to read.
    query := `filter={"name":"John","age":30}&sort=created`
    
    result, err := rfcquery.ParseJSONQuery(query, "filter")
    if err != nil {
        log.Fatal(err)
    }
    
    // Access the parsed JSON
    filterData := result.(map[string]interface{})
    fmt.Println(filterData["name"]) // "John"
    

    Features:

    • Parses percent-encoded JSON (%7B%22name%22%3A%22John%22%7D)
    • Handles special characters without mangling
    • Supports arrays, objects, primitives
    • Optional: can parse the entire query string as JSON (without the 'key')
  3. GraphQL-over-HTTP: extract GraphQL queries from URL parameters (per the GraphQL-over-HTTP spec):

    query := `query=query GetUser($id: ID!) { user(id: $id) { name } }&variables={"id": "123"}`
    
    graphql, err := rfcquery.ParseGraphQLQuery(query)
    if err != nil {
        log.Fatal(err)
    }
    
    fmt.Println(graphql.Query)         // The GraphQL query document
    fmt.Println(graphql.OperationName) // "GetUser"
    fmt.Println(graphql.Variables)     // map[id:123]
    

    Features:

    • Handles special GraphQL characters (@, !, $) without percent-encoding
    • Parses optional variables JSON parameter
    • Supports operationName for multiple operations
    • Works with percent-encoded queries
  4. TMF API Guidelines (TMF630): parse complex filter expressions following the TMF630 guidelines:

    query := "dateTime%3E%3D2013-04-20;status=active,suspended&sort=-created,+name&limit=10"
    
    tmf, err := rfcquery.ParseTMFQuery(query)
    if err != nil {
        log.Fatal(err)
    }
    
    // Access filter expressions
    for _, expr := range tmf.Expressions {
        fmt.Printf("%s %s %s\n", expr.Field, expr.Operator, expr.Value)
        // Output: dateTime gte 2013-04-20
        //         status eq active,suspended
    }
    
    // Access sorting
    for _, sort := range tmf.Sorting {
        fmt.Printf("Sort by %s (%s)\n", sort.Field, sort.Direction)
        // Output: Sort by created (desc)
        //         Sort by name (asc)
    }
    
    // Access pagination params
    limit := tmf.OtherParams["limit"][0] // "10"
    
    • Encoded operators (%3E for >, %3C for <, etc.)
    • Multiple separators (= and ; treated identically)
    • List values (comma-separated)
    • Sorting with +/- prefixes
    • Multiple operators on the same field (date%3E2017-04-01;date%3C2017-05-01)
    • Values containing encoded operators (no false positives)
    • RFC3986-compliant (special chars like @, :, / work correctly)
  5. Custom Parser: to write a custom parser, implement the Parser interface:

    type MyCustomParser struct{}
    
    func (p *MyCustomParser) Parse(scanner *rfcquery.Scanner) (interface{}, error) {
        // Use scanner.CollectWhile(), scanner.NextToken(), etc.
        // Return your custom data structure
        return nil, nil // placeholder so the skeleton compiles
    }
    
    func (p *MyCustomParser) Name() string {
        return "my-custom-parser"
    }
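
Several of the parsers listed above work on percent-encoded payloads, such as the JSON-in-query value `%7B%22name%22%3A%22John%22%7D`. As a rough illustration of the idea using only the standard library (this is not how rfcquery does it, and `decodeJSONParam` is a hypothetical helper), a raw parameter value can be percent-decoded and then unmarshalled:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// decodeJSONParam percent-decodes a raw parameter value and unmarshals it.
// url.PathUnescape is used rather than url.QueryUnescape so that a literal
// '+' is not turned into a space.
func decodeJSONParam(raw string) (map[string]any, error) {
	decoded, err := url.PathUnescape(raw)
	if err != nil {
		return nil, err
	}
	var out map[string]any
	if err := json.Unmarshal([]byte(decoded), &out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	filter, err := decodeJSONParam("%7B%22name%22%3A%22John%22%2C%22age%22%3A30%7D")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(filter["name"], filter["age"]) // John 30
}
```

Note that `encoding/json` decodes JSON numbers into `float64` when the target is `any`, which is why the age needs a type assertion before it can be used as a number.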
    
Roadmap
  • GraphQL query parser plugin
  • TMF query parser plugin
  • Query builder API (fluent interface)
  • Streaming parser for very large queries
  • JSON Schema validation for JSON-in-query
  • Performance optimizations with pooled scanner
  • encoder package for strict rfc encoding

Contributing

We welcome contributions! Please see CONTRIBUTING for guidelines.

License

MIT License - see LICENSE file for details.

Why choose rfcquery?

  • ✅ Correctness: Strict RFC3986 compliance
  • ✅ Flexibility: Plugin system for any query format
  • ✅ Performance: Bulk operations and minimal allocations
  • ✅ Developer Experience: Clear errors with positions

Documentation


Types

type Error

type Error struct {
	Pos Position
	Msg string
}

func (*Error) Error

func (e *Error) Error() string

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer performs RFC3986 lexical analysis on query strings

func NewLexer

func NewLexer(input string) *Lexer

NewLexer creates a new Lexer for the given query string

func (*Lexer) Decode

func (l *Lexer) Decode() (string, error)

Decode returns the fully decoded query string. It performs both validation and percent-decoding.
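
For readers unfamiliar with percent-decoding, the following standalone sketch shows the decoding half of that operation. It is an assumption-laden illustration, not the package's code: `unhex` and `percentDecode` are hypothetical names, and character-class validation of the remaining bytes is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// unhex converts one hexadecimal digit to its value; ok is false otherwise.
func unhex(c byte) (v byte, ok bool) {
	switch {
	case c >= '0' && c <= '9':
		return c - '0', true
	case c >= 'a' && c <= 'f':
		return c - 'a' + 10, true
	case c >= 'A' && c <= 'F':
		return c - 'A' + 10, true
	}
	return 0, false
}

// percentDecode replaces every %HH sequence with the byte it encodes and
// copies all other bytes unchanged, so '+' stays a plus sign.
func percentDecode(s string) (string, error) {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '%' {
			b.WriteByte(s[i])
			continue
		}
		if i+2 >= len(s) {
			return "", fmt.Errorf("truncated percent-encoding at offset %d", i)
		}
		hi, ok1 := unhex(s[i+1])
		lo, ok2 := unhex(s[i+2])
		if !ok1 || !ok2 {
			return "", fmt.Errorf("invalid percent-encoding at offset %d", i)
		}
		b.WriteByte(hi<<4 | lo)
		i += 2 // skip the two hex digits just consumed
	}
	return b.String(), nil
}

func main() {
	out, err := percentDecode("name=John%20Doe&op=1+1")
	fmt.Printf("%q %v\n", out, err) // "name=John Doe&op=1+1" <nil>
}
```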

func (*Lexer) Valid

func (l *Lexer) Valid() error

Valid performs strict RFC3986 validation of the query string. It returns nil if the string is valid, or an error with position information.

type Parser

type Parser interface {
	Parse(scanner *Scanner) (any, error)

	Name() string
}

Parser is the interface that all query parsers must implement

type Position

type Position struct {
	Offset int
}

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner provides token-by-token access to the query string

func NewScanner

func NewScanner(input string) *Scanner

NewScanner creates a new scanner for the query string

func (*Scanner) CollectAll

func (s *Scanner) CollectAll() (TokenSlice, error)

CollectAll reads all remaining tokens into a slice

func (*Scanner) CollectN

func (s *Scanner) CollectN(n int) (TokenSlice, error)

CollectN collects exactly n tokens. It returns an error if fewer than n tokens are available.

func (*Scanner) CollectUntil

func (s *Scanner) CollectUntil(predicate func(Token) bool) (TokenSlice, error)

CollectUntil collects tokens until the predicate returns true. The token that matches the predicate is left unconsumed.

func (*Scanner) CollectWhile

func (s *Scanner) CollectWhile(predicate func(Token) bool) (TokenSlice, error)

CollectWhile collects tokens while the predicate returns true. The token that fails the predicate is left unconsumed.

func (*Scanner) NextToken

func (s *Scanner) NextToken() (Token, error)

NextToken returns the next token and advances the scanner

func (*Scanner) PeekN

func (s *Scanner) PeekN(n int) (TokenSlice, error)

PeekN returns the next n tokens without consuming them

func (*Scanner) PeekToken

func (s *Scanner) PeekToken() (Token, error)

PeekToken returns the next token without advancing the scanner

func (*Scanner) Pos

func (s *Scanner) Pos() int

func (*Scanner) Reset

func (s *Scanner) Reset()

func (*Scanner) Rewind

func (s *Scanner) Rewind(n int)

func (*Scanner) SkipWhile

func (s *Scanner) SkipWhile(predicate func(Token) bool) (int, error)

SkipWhile skips tokens while the predicate returns true. It returns the number of skipped tokens.

func (*Scanner) Valid

func (s *Scanner) Valid() error

Valid performs full validation without tokenizing
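
The peek, next and collect-while operations documented above follow a common lookahead-scanner pattern. The sketch below illustrates that pattern over raw bytes rather than tokens. `byteScanner` and its methods are hypothetical and are not the package's Scanner, but the "failing element is left unconsumed" rule matches the CollectWhile description.

```go
package main

import "fmt"

// byteScanner is a hypothetical stand-in for a token scanner: it walks a
// string byte by byte and supports lookahead without consuming input.
type byteScanner struct {
	input string
	pos   int
}

// peek returns the next byte without advancing; ok is false at end of input.
func (s *byteScanner) peek() (byte, bool) {
	if s.pos >= len(s.input) {
		return 0, false
	}
	return s.input[s.pos], true
}

// next returns the next byte and advances past it.
func (s *byteScanner) next() (byte, bool) {
	c, ok := s.peek()
	if ok {
		s.pos++
	}
	return c, ok
}

// collectWhile consumes bytes while keep returns true. The first byte that
// fails keep is left unconsumed.
func (s *byteScanner) collectWhile(keep func(byte) bool) string {
	start := s.pos
	for {
		c, ok := s.peek()
		if !ok || !keep(c) {
			return s.input[start:s.pos]
		}
		s.pos++
	}
}

func main() {
	s := &byteScanner{input: "name=John&age=30"}
	key := s.collectWhile(func(c byte) bool { return c != '=' })
	s.next() // consume '='
	val := s.collectWhile(func(c byte) bool { return c != '&' })
	fmt.Println(key, val, s.pos) // name John 9
}
```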

type Token

type Token struct {
	Type    TokenType
	Value   string // Raw value (percent-encoded if present)
	Decoded string // Decoded value for percent-encoded tokens
	Start   Position
	End     Position
}

Token represents a lexical token in a query string

type TokenSlice

type TokenSlice []Token

func (TokenSlice) Bytes

func (ts TokenSlice) Bytes() []byte

Bytes returns the raw byte representation

func (TokenSlice) SplitSubDelimiter added in v0.4.0

func (ts TokenSlice) SplitSubDelimiter(del string) []TokenSlice

func (TokenSlice) String

func (ts TokenSlice) String() string

String reconstructs the original query string. It implements the Stringer interface.

func (TokenSlice) StringDecoded

func (ts TokenSlice) StringDecoded() string

StringDecoded reconstructs the fully decoded query string

type TokenType

type TokenType int
const (
	TokenInvalid        TokenType = iota
	TokenPercentEncoded           // %HH sequence
	TokenUnreserved               // ALPHA / DIGIT/ - / . / _  / ~
	TokenSubDelims                // ! / $ / & / ' / ( / ) / * / + / , / ; / =
	TokenPcharOther               // : / @
	TokenPathChar                 // '/' / ?
	TokenEOF
)
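
As a rough guide to how characters map onto these classes, the standalone sketch below classifies single bytes according to the comments on the constants above. `classify` and its string labels are hypothetical; the package's real lexer may group or name things differently.

```go
package main

import (
	"fmt"
	"strings"
)

// classify names the RFC 3986 character class of a single query byte.
// The returned labels are illustrative; they only mirror the comments on
// the TokenType constants.
func classify(c byte) string {
	switch {
	case c == '%':
		return "percent-encoded" // start of a %HH sequence
	case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9',
		strings.IndexByte("-._~", c) >= 0:
		return "unreserved"
	case strings.IndexByte("!$&'()*+,;=", c) >= 0:
		return "sub-delims"
	case c == ':' || c == '@':
		return "pchar-other"
	case c == '/' || c == '?':
		return "path-char"
	}
	return "invalid"
}

func main() {
	// Print the class of a few representative bytes, including a space,
	// which is not allowed in a query without percent-encoding.
	for _, c := range []byte("a=%:/ ") {
		fmt.Printf("%q -> %s\n", c, classify(c))
	}
}
```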

func (TokenType) String

func (tt TokenType) String() string

String returns a readable representation of the token type. It implements the Stringer interface.

type Value

type Value struct {
	// the decoded value
	Value string

	// Whether this key was seen multiple times
	HasMultiple bool

	// Positions information for precise error report
	KeyPos   Position
	ValuePos Position

	// Original Tokens for inspection
	KeyTokens   TokenSlice
	ValueTokens TokenSlice
}

Value represents a parsed query value with metadata

type Values

type Values struct {
	// contains filtered or unexported fields
}

Values is a collection of parsed query params. It preserves insertion order and allows duplicate keys.

func NewValues

func NewValues() *Values

func (*Values) Add

func (v *Values) Add(key string, value Value)

Add a key-value pair

func (*Values) AllKeys

func (v *Values) AllKeys() []string

AllKeys returns all keys in insertion order

func (*Values) First

func (v *Values) First(key string) (Value, bool)

First returns the first value for a key

func (*Values) Get

func (v *Values) Get(key string) []Value

Get returns all values for a key

func (*Values) Len

func (v *Values) Len() int

Len returns the total number of key-value pairs
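
The fields of Values are unexported, so its internal layout is not documented here. As one possible way to get the documented behaviour (insertion order preserved, duplicate keys kept), the sketch below pairs a slice of entries with an index map. `orderedValues`, `pair` and all method names are hypothetical, and returning each distinct key once from `keys` is an assumption rather than a statement about AllKeys.

```go
package main

import "fmt"

// pair is one key-value entry in the order it appeared in the query.
type pair struct{ key, value string }

// orderedValues is a hypothetical insertion-ordered multi-map: entries keeps
// the original order, index maps each key to its positions in entries.
type orderedValues struct {
	entries []pair
	index   map[string][]int
}

func newOrderedValues() *orderedValues {
	return &orderedValues{index: make(map[string][]int)}
}

// add appends a pair; duplicate keys are kept rather than overwritten.
func (v *orderedValues) add(key, value string) {
	v.index[key] = append(v.index[key], len(v.entries))
	v.entries = append(v.entries, pair{key, value})
}

// get returns every value stored under key, in insertion order.
func (v *orderedValues) get(key string) []string {
	var out []string
	for _, i := range v.index[key] {
		out = append(out, v.entries[i].value)
	}
	return out
}

// keys returns each distinct key once, ordered by first appearance.
func (v *orderedValues) keys() []string {
	seen := make(map[string]bool)
	var out []string
	for _, p := range v.entries {
		if !seen[p.key] {
			seen[p.key] = true
			out = append(out, p.key)
		}
	}
	return out
}

func main() {
	v := newOrderedValues()
	v.add("tags", "go")
	v.add("sort", "name")
	v.add("tags", "library")
	fmt.Println(v.get("tags"), v.keys(), len(v.entries))
	// [go library] [tags sort] 3
}
```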

Directories

Path Synopsis
internal
percent
Package percent provides full RFC3986 percent-encoding/decoding
plugins
