splitter

v1.0.0
Published: Nov 16, 2025 License: MIT Imports: 17 Imported by: 0

README

📂 File Splitter


A robust, production-ready Go library and CLI tool for splitting large files into encrypted chunks and seamlessly merging them back together with automatic integrity verification.

Overview

File Splitter is a library and CLI tool that handles the chunking and reconstruction of files. Key features include:

  • Intelligent Chunking: Split files into configurable chunk sizes (bytes to gigabytes)
  • Military-Grade Encryption: AES-256 encryption with automatic or manual key management
  • Data Integrity: Automatic SHA-256 checksum verification on merge operations
  • Thread-Safe Operations: Concurrent processing with proper synchronization
  • Atomic File Writes: Ensures data consistency even during failures
  • Context-Aware: Full support for cancellation and timeouts
  • Path Traversal Protection: Robust security against malicious file paths

✨ Features

  • 📦 Intelligent Chunking - Split files into configurable sizes (bytes to gigabytes)
  • 🔒 AES-256 Encryption - Optional military-grade encryption for each chunk
  • ✅ SHA-256 Integrity - Automatic checksum verification on merge
  • 🧵 Thread-Safe Operations - Safe for concurrent use with proper synchronization
  • ⚡ High Performance - Concurrent checksumming and chunking for maximum speed
  • 🛡️ Security Hardened - Path traversal protection, null byte filtering, atomic file writes
  • 📊 Progress Reporting - Real-time progress updates for long operations
  • 🔄 Context Support - Cancellation and timeout support for all operations
  • 🎯 Atomic Operations - Ensures data consistency even during failures
  • 🔑 Flexible Key Management - Auto-generation or custom key file support
  • 🗂️ Clean Output - Organized directory structure with sanitized filenames
  • 📱 Cross-Platform - Works on Linux, macOS, and Windows

🚀 Quick Start

Command Line Tool

The easiest way to get started is using the CLI tool for splitting and merging files:

# Clone the repository
git clone https://github.com/AlexanderEl/splitter.git
cd splitter

# Build the CLI tool
go build -o splitter cmd/splitter/main.go

# Split a file into 10MB chunks (default)
./splitter -op split -file large-video.mp4

# Merge chunks back together
./splitter -op merge -file file-data_large-video.mp4

See the CLI Usage section for complete documentation.

📦 Installation

As a Library
go get github.com/AlexanderEl/splitter
As a CLI Tool
# Clone and build
git clone https://github.com/AlexanderEl/splitter.git
cd splitter
go build -o splitter cmd/splitter/main.go

# Optionally, install to your PATH
go install github.com/AlexanderEl/splitter/cmd/splitter@latest

🖥️ CLI Usage

The splitter comes with a professional command-line interface for easy file splitting and merging.

Building the CLI
# Clone the repository
git clone https://github.com/AlexanderEl/splitter.git
cd splitter

# Build the CLI tool
go build -o splitter cmd/splitter/main.go

# Optionally, install it to your PATH
go install github.com/AlexanderEl/splitter/cmd/splitter@latest
Command Syntax
splitter -op <operation> -file <path> [options]
Available Flags
Flag        Description                                  Required  Default
-op         Operation: split or merge                    Yes       -
-file       Path to file (split) or directory (merge)    Yes       -
-size       Size of each chunk                           No        10
-format     Chunk format: B, KB, MB, or GB               No        MB
-out        Output directory for chunks                  No        Current directory
-e          Enable encryption/decryption                 No        false
-key        Path to encryption key file (merge)          No        -
-save-key   Save encryption key to file (split)          No        true
-key-path   Custom path for encryption key               No        encryption-key.txt
-v          Verbose output with progress                 No        false
-version    Show version and exit                        No        false
Split Examples

Basic Split - Default 10MB chunks

splitter -op split -file ~/Downloads/large-video.mp4

Custom Chunk Size - 100KB chunks

splitter -op split -file ~/Downloads/document.pdf -size 100 -format KB

With Encryption - Generates encryption key automatically

splitter -op split -file ~/Downloads/sensitive.zip -e

Encrypted Split with Custom Key Location

splitter -op split -file ~/Downloads/data.tar.gz -e -key-path /secure/my-key.txt

Verbose Output with Progress

splitter -op split -file ~/Downloads/large-file.bin -size 50 -format MB -v

Custom Output Directory

splitter -op split -file ~/Downloads/archive.tar -out ~/chunks/
Merge Examples

Basic Merge - Reconstructs original file

splitter -op merge -file file-data_large-video.mp4

Merge with Verbose Output

splitter -op merge -file file-data_document.pdf -v

Encrypted Merge - Uses key from split operation

splitter -op merge -file file-data_sensitive.zip -e -key encryption-key.txt

Merge with Custom Key Location

splitter -op merge -file file-data_archive.tar -e -key /secure/my-key.txt -v
CLI Output Behavior

Default Output Paths:

  • Split: Creates directory file-data_<filename> containing chunks
    • document.pdf → file-data_document.pdf/data_00, data_01, etc.
  • Merge: Creates file with name extracted from directory
    • file-data_document.pdf → document.pdf
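
The default naming rule above can be sketched with the standard library alone. This is an illustration, not the library's code: the helper names splitDirName and mergedFileName are hypothetical, and only the file-data_ prefix is taken from this README.

```go
package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// outputDirPrefix mirrors the OutputDirPrefix constant documented in this README.
const outputDirPrefix = "file-data_"

// splitDirName returns the default directory name for a split of inputPath.
// (Hypothetical helper for illustration.)
func splitDirName(inputPath string) string {
    return outputDirPrefix + filepath.Base(inputPath)
}

// mergedFileName recovers the original file name from a chunk directory path.
// (Hypothetical helper for illustration.)
func mergedFileName(chunkDir string) string {
    return strings.TrimPrefix(filepath.Base(chunkDir), outputDirPrefix)
}

func main() {
    dir := splitDirName("/home/user/documents/document.pdf")
    fmt.Println(dir)
    fmt.Println(mergedFileName(dir))
}
```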

Normal Mode (default):

$ ./splitter -op split -file test.txt -size 100 -format KB
✓ Split complete: test.txt → ./file-data_test.txt

Verbose Mode (-v flag):

$ ./splitter -op split -file test.txt -size 100 -format KB -v

=========================================================

    ███████╗██████╗ ██╗     ██╗████████╗████████╗███████╗██████╗
    ██╔════╝██╔══██╗██║     ██║╚══██╔══╝╚══██╔══╝██╔════╝██╔══██╗
    ███████╗██████╔╝██║     ██║   ██║      ██║   █████╗  ██████╔╝
    ╚════██║██╔═══╝ ██║     ██║   ██║      ██║   ██╔══╝  ██╔══██╗
    ███████║██║     ███████╗██║   ██║      ██║   ███████╗██║  ██║
    ╚══════╝╚═╝     ╚══════╝╚═╝   ╚═╝      ╚═╝   ╚══════╝╚═╝  ╚═╝

         📂 Secure File Splitting Tool v1.0.0

=========================================================

📂 Splitting file: test.txt
   Location: /home/user/documents
   Chunk size: 100 KB

📊 Progress:
[████████████████████████████████████████] 100%

✓ File split successfully!
   Output directory: ./file-data_test.txt

✓ Operation completed successfully in 234ms!
CLI Error Messages

The CLI provides clear error messages for common issues:

# Missing input file
$ ./splitter -op split -file nonexistent.txt
❌ Error: input file does not exist: nonexistent.txt

# Invalid chunk format
$ ./splitter -op split -file test.txt -format TB
❌ Error: invalid split configuration: invalid format 'TB', must be one of: B, KB, MB, GB

# Missing encryption key for merge
$ ./splitter -op merge -file file-data_encrypted.zip -e
❌ Error: encryption key file path required (use -key flag)

# Wrong encryption key
$ ./splitter -op merge -file file-data_document.pdf -e -key wrong-key.txt
❌ Error: merge operation failed: failed to decrypt chunk 'data_0': cipher: message authentication failed

# Checksum mismatch
$ ./splitter -op merge -file file-data_corrupted.bin
❌ Error: merge operation failed: checksum mismatch: merged file 'corrupted.bin' is corrupted
CLI Best Practices
  1. Keep Your Key File Safe

    # After splitting with encryption, backup your key
    cp encryption-key.txt ~/secure-backup/
    
    # Set restrictive permissions
    chmod 600 encryption-key.txt
    
  2. Batch Processing

    # Split all large files in a directory
    for file in *.mp4; do
        ./splitter -op split -file "$file" -size 50 -format MB
    done
    
    # Merge all split directories
    for dir in file-data_*; do
        ./splitter -op merge -file "$dir" -v
    done
    
  3. Use Different Keys for Different Projects

    # Project A
    ./splitter -op split -file project-a-data.zip -e -key-path keys/project-a.txt
    
    # Project B
    ./splitter -op split -file project-b-data.zip -e -key-path keys/project-b.txt
    
  4. Verify Split Integrity

    # Record the original checksum, then split
    sha256sum important.bin > important.bin.sha256
    ./splitter -op split -file important.bin -v

    # Merge (the merge verifies the stored checksum automatically)
    ./splitter -op merge -file file-data_important.bin -v

    # Optionally re-check the merged file against the recorded checksum
    sha256sum -c important.bin.sha256
    
Integration with Scripts
#!/bin/bash
# backup-and-split.sh
set -euo pipefail  # Stop on the first failed command or unset variable

BACKUP_FILE="/data/large-database-backup.sql"
KEY_FILE="$HOME/.secrets/backup-key.txt"
DEST_DIR="/backup/$(date +%Y%m%d)"

# Split the backup into manageable chunks
./splitter -op split \
    -file "$BACKUP_FILE" \
    -size 100 \
    -format MB \
    -e \
    -key-path "$KEY_FILE" \
    -v

# Move chunks to backup location (create it first if needed)
mkdir -p "$DEST_DIR"
mv "file-data_$(basename "$BACKUP_FILE")" "$DEST_DIR/"

echo "Backup split and encrypted successfully!"

💻 Programmatic Usage

Basic Split Example
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/AlexanderEl/splitter"
)

func main() {
    // Create splitter instance
    s, err := splitter.NewSplit(
        "/path/to/directory",
        "large-file.bin",
        true, // verbose
        &splitter.EncryptionConfig{
            IsEncrypted: false,
        },
    )
    if err != nil {
        log.Fatal(err)
    }

    // Configure split parameters
    configs := splitter.SplitConfigs{
        ChunkSize: 50,      // 50 MB chunks
        Format:    "MB",
    }

    // Execute split operation
    ctx := context.Background()
    if err := s.Split(ctx, configs); err != nil {
        log.Fatal(err)
    }

    fmt.Println("File split successfully!")
}
Split with Encryption
s, err := splitter.NewSplit(
    "/path/to/directory",
    "sensitive-data.zip",
    true,
    &splitter.EncryptionConfig{
        IsEncrypted: true,
        WriteToFile: true,
        KeyFilePath: "/secure/location/encryption.key",
    },
)
if err != nil {
    log.Fatal(err)
}

configs := splitter.SplitConfigs{
    ChunkSize: 10,
    Format:    "MB",
}

ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
    log.Fatal(err)
}

// Export key for later use
key, err := s.ExportEncryptionKey()
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Encryption key: %x\n", key)
Merge Example
merger, err := splitter.NewSplit(
    "/path/to/chunks/file-data_large-file.bin",
    "reconstructed.bin",
    true,
    &splitter.EncryptionConfig{
        IsEncrypted: false,
    },
)
if err != nil {
    log.Fatal(err)
}

ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
    log.Fatal(err)
}

fmt.Println("File merged successfully with checksum verification!")
Merge with Encryption
merger, err := splitter.NewSplit(
    "/path/to/chunks/file-data_sensitive-data.zip",
    "decrypted.zip",
    true,
    &splitter.EncryptionConfig{
        IsEncrypted: true,
    },
)
if err != nil {
    log.Fatal(err)
}

// Load encryption key
if err := merger.SetEncryption("/secure/location/encryption.key"); err != nil {
    log.Fatal(err)
}

ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
    log.Fatal(err)
}
Context Cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

if err := s.Split(ctx, configs); err != nil { // s is the *Split returned by NewSplit
    if errors.Is(err, context.DeadlineExceeded) {
        log.Println("Operation timed out")
    } else {
        log.Fatal(err)
    }
}

📖 API Documentation

Core Types
Splitter Interface

Defines operations for splitting and merging files.

type Splitter interface {
    Split(ctx context.Context, configs SplitConfigs) error
    SetEncryption(filePath string) error
    ExportEncryptionKey() ([]byte, error)
    Merge(ctx context.Context) error
}
Split Struct

Main implementation of the Splitter interface.

type Split struct {
    FileName       string
    FilePath       string
    OutputDir      string
    ProgressFunc   ProgressFunc
    Verbose        bool
    CleanupOnError bool
    MaxFileSize    int64
    MaxChunks      int64
}
Constructor
NewSplit(filePath, fileName string, verbose bool, config *EncryptionConfig) (*Split, error)

Creates a new Split instance with specified configuration.

Parameters:

  • filePath - Directory containing the file (split) or directory with chunks (merge)
  • fileName - Name of the file to split or output filename for merge
  • verbose - Enable progress reporting
  • config - Encryption configuration

Returns: Split instance or error

Example:

s, err := splitter.NewSplit( // avoid naming the variable "splitter": it would shadow the package
    "/path/to/directory",
    "large-file.bin",
    true, // verbose
    &splitter.EncryptionConfig{
        IsEncrypted: false,
    },
)
Configuration Types
SplitConfigs

Configuration for split operations.

type SplitConfigs struct {
    ChunkSize uint   // Size of each chunk
    Format    string // Format: "B", "KB", "MB", "GB"
}

Methods:

  • Validate() error - Validates the configuration

Example:

configs := splitter.SplitConfigs{
    ChunkSize: 50,
    Format:    "MB",
}

if err := configs.Validate(); err != nil {
    log.Fatal(err)
}
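
To make the size/format pair concrete, here is a minimal sketch of how it could translate into bytes and a chunk count. The helper names are hypothetical, and binary (1024-based) units are an assumption, chosen because the documented 100 GB default of 107374182400 bytes equals 100 × 1024³. The sketch performs no overflow check.

```go
package main

import (
    "errors"
    "fmt"
)

// unitBytes maps each documented Format value to a byte multiplier.
// Binary (1024-based) units are an assumption of this sketch.
var unitBytes = map[string]int64{
    "B":  1,
    "KB": 1 << 10,
    "MB": 1 << 20,
    "GB": 1 << 30,
}

// chunkBytes converts a size/format pair into a byte count.
func chunkBytes(size uint, format string) (int64, error) {
    mult, ok := unitBytes[format]
    if !ok {
        return 0, fmt.Errorf("invalid format %q, must be one of: B, KB, MB, GB", format)
    }
    if size == 0 {
        return 0, errors.New("chunk size must be positive")
    }
    return int64(size) * mult, nil
}

// chunkCount returns how many chunks a file of fileSize bytes needs
// (ceiling division).
func chunkCount(fileSize, chunk int64) int64 {
    return (fileSize + chunk - 1) / chunk
}

func main() {
    b, err := chunkBytes(50, "MB")
    if err != nil {
        panic(err)
    }
    // A 1 GiB file in 50 MiB chunks.
    fmt.Println(b, chunkCount(1<<30, b)) // prints "52428800 21"
}
```

A count computed this way is what a limit such as MaxChunks would be compared against.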
EncryptionConfig

Configuration for encryption settings.

type EncryptionConfig struct {
    IsEncrypted bool   // Enable encryption
    WriteToFile bool   // Write key to file
    KeyFilePath string // Path for key file
}

Example:

encConfig := &splitter.EncryptionConfig{
    IsEncrypted: true,
    WriteToFile: true,
    KeyFilePath: "/secure/location/key.txt",
}
Core Methods
Split(ctx context.Context, configs SplitConfigs) error

Splits a file into chunks based on configuration.

Parameters:

  • ctx - Context for cancellation/timeout
  • configs - Split configuration (chunk size and format)

Returns: Error if operation fails

Features:

  • Validates file path and configuration
  • Creates output directory with sanitized name
  • Splits file into chunks with optional encryption
  • Generates SHA-256 checksum file
  • Concurrent checksumming for performance
  • Automatic cleanup on error (if enabled)

Example:

ctx := context.Background()
configs := splitter.SplitConfigs{
    ChunkSize: 10,
    Format:    "MB",
}

if err := s.Split(ctx, configs); err != nil { // s is the *Split returned by NewSplit
    log.Fatal(err)
}
Merge(ctx context.Context) error

Merges split chunks back into original file with integrity verification.

Parameters:

  • ctx - Context for cancellation/timeout

Returns: Error if operation fails

Features:

  • Validates directory structure
  • Reads and orders chunks sequentially
  • Optional decryption of chunks
  • SHA-256 checksum verification
  • Atomic file creation (temp + rename)
  • Automatic cleanup of temp files on error

Example:

merger, err := splitter.NewSplit(
    "/path/to/file-data_myfile",
    "restored-file.bin",
    true,
    &splitter.EncryptionConfig{IsEncrypted: false},
)
if err != nil {
    log.Fatal(err)
}

ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
    log.Fatal(err)
}
SetEncryption(filePath string) error

Loads encryption key from file for merge operations.

Parameters:

  • filePath - Path to encryption key file

Returns: Error if key cannot be loaded

Example:

if err := merger.SetEncryption("/path/to/encryption-key.txt"); err != nil {
    log.Fatal(err)
}
ExportEncryptionKey() ([]byte, error)

Exports the current encryption key for backup or sharing.

Returns: Encryption key bytes or error

Example:

key, err := s.ExportEncryptionKey() // s is the *Split returned by NewSplit
if err != nil {
    log.Fatal(err)
}

// Save to secure location (owner read/write only)
if err := os.WriteFile("/secure/backup-key.txt", key, 0600); err != nil {
    log.Fatal(err)
}
Helper Types
ProgressFunc

Function type for progress reporting callbacks.

type ProgressFunc func(current, total int64)

Example:

customProgress := func(current, total int64) {
    percent := float64(current) / float64(total) * 100
    fmt.Printf("\rProgress: %.2f%%", percent)
}

s.ProgressFunc = customProgress // s is the *Split returned by NewSplit
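
When a callback does expensive work such as terminal output, it can be wrapped so it fires less often. The sketch below is one hypothetical way to do that, firing only when the whole-number percentage changes. The local ProgressFunc type mirrors the documented one, throttle is not part of the library, and the wrapper is not safe for concurrent calls.

```go
package main

import "fmt"

// ProgressFunc mirrors the callback type documented above.
type ProgressFunc func(current, total int64)

// throttle wraps next so that it fires only when the whole-number percentage
// changes. (Hypothetical helper; not safe for concurrent use.)
func throttle(next ProgressFunc) ProgressFunc {
    last := int64(-1)
    return func(current, total int64) {
        if total <= 0 {
            return // nothing sensible to report
        }
        pct := current * 100 / total
        if pct != last {
            last = pct
            next(current, total)
        }
    }
}

func main() {
    calls := 0
    report := throttle(func(current, total int64) { calls++ })
    for i := int64(1); i <= 10000; i++ {
        report(i, 10000)
    }
    fmt.Println("callbacks fired:", calls) // 101: one per percentage value 0..100
}
```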
Constants
const (
    OutputDirPrefix        = "file-data_"
    ChecksumFileName       = "checksum.sha256"
    DefaultDirPermissions  = 0755
    DefaultFilePermissions = 0644
    MinChunkSize           = 1024         // 1 KB minimum
    MaxFileSize            = 107374182400 // 100 GB default
    MaxChunks              = 10000        // Maximum chunks per file
)
Error Types
var (
    ErrInvalidPath              = errors.New("invalid file path")
    ErrDirectoryExists          = errors.New("output directory already exists")
    ErrFileTooLarge             = errors.New("file exceeds maximum size")
    ErrTooManyChunks            = errors.New("would create too many chunks")
    ErrEmptyDirectory           = errors.New("directory is empty")
    ErrChecksumMismatch         = errors.New("checksum mismatch")
    ErrEncryptionNotInitialized = errors.New("encryption service not initialized")
)
Error Handling

The library uses sentinel errors for clear error handling:

err := s.Split(ctx, configs) // s is the *Split returned by NewSplit

if errors.Is(err, splitter.ErrDirectoryExists) {
    // Handle existing directory
    log.Println("Output directory already exists")
} else if errors.Is(err, splitter.ErrFileTooLarge) {
    // Handle file size limit
    log.Println("File is too large to split")
} else if errors.Is(err, splitter.ErrChecksumMismatch) {
    // Handle corruption
    log.Println("Merged file is corrupted")
}

🎯 Configuration Options

Parameter Type Description Valid Values
ChunkSize uint Size of each chunk Any positive integer
Format string Unit for chunk size B, KB, MB, GB
Split Configuration
Parameter Type Description Valid Values
ChunkSize uint Size of each chunk Any positive integer
Format string Unit for chunk size B, KB, MB, GB
Encryption Configuration
Parameter Type Description Default
IsEncrypted bool Enable encryption false
WriteToFile bool Save key to file false
KeyFilePath string Path for key file ./encryption-key.txt
Splitter Options
Option Type Description Default
Verbose bool Enable progress reporting false
CleanupOnError bool Remove partial files on failure true
MaxFileSize int64 Maximum file size to process 100 GB
MaxChunks int64 Maximum number of chunks 10,000
OutputDir string Directory for output files Current directory
ProgressFunc ProgressFunc Custom progress callback Default progress bar

🔐 Security Features

Encryption
  • AES-256-GCM - Industry-standard authenticated encryption with associated data (AEAD)
  • Unique Keys - Each split operation generates a unique encryption key
  • Automatic Key Management - Keys auto-generated or loaded from files
  • Thread-Safe Encryption - All encryption operations are thread-safe
Data Integrity
  • SHA-256 Checksums - Automatic checksum generation and verification
  • Constant-Time Comparison - Prevents timing attacks on checksum validation
  • Atomic File Operations - Temp file + rename ensures consistency
Path Security
  • Path Traversal Protection - Validates and cleans all file paths
  • Null Byte Filtering - Prevents null byte injection attacks
  • Filename Sanitization - Removes dangerous characters from filenames
  • Directory Validation - Ensures operations stay within intended directories
Operational Security
  • Context Support - Proper cancellation prevents resource leaks
  • Automatic Cleanup - Failed operations clean up partial files
  • Secure File Permissions - Key files created with 0600 permissions
  • Memory Safety - Thread-safe operations with proper locking
Best Practices

✅ Always verify checksums after merge operations
✅ Store encryption keys in secure locations with restricted permissions
✅ Never commit encryption keys to version control
✅ Use unique keys for different files or projects
✅ Enable verbose mode for critical operations
✅ Implement regular key backup procedures
✅ Test recovery procedures before production use
✅ Use context timeouts for long-running operations

Architecture

Split Process
  1. Validation: Validates file path, chunk size, and configuration
  2. Preparation: Creates output directory and calculates chunk count
  3. Concurrent Operations:
    • Chunk Creation: Reads and writes file chunks with optional encryption
    • Checksum Generation: Computes SHA-256 hash in parallel
  4. Atomic Writes: Uses temporary files and rename for atomicity
  5. Progress Reporting: Reports progress at configurable intervals
Merge Process
  1. Validation: Verifies directory structure and encryption setup
  2. Discovery: Identifies and orders data chunks
  3. Sequential Reconstruction: Reads chunks with optional decryption
  4. Checksum Verification: Validates reconstructed file integrity
  5. Atomic Completion: Renames temp file only after successful verification
Security Model
  • Path Sanitization: All file paths are cleaned and validated
  • Encryption: AES-256-GCM with unique keys per split operation
  • Integrity: SHA-256 checksums protect against corruption
  • Constant-Time Comparison: Prevents timing attacks on checksums
  • Thread Safety: All encryption operations are thread-safe

⚡ Performance

Quick Reference:

Operation          Speed       Throughput      Memory
Split              682 µs/op   ~1.47K ops/sec  154 KB/op
Merge              2.08 ms/op  ~481 ops/sec    1.14 MB/op
Split (Encrypted)  887 µs/op   ~1.13K ops/sec  1.27 MB/op

Benchmarks on Intel Core i9-14900HX (32 threads) - averaged over 20 runs:

TEST_NAME                          ITERATIONS  AVG_ITER_DURATION    MEMORY_USED  NUM_MEMORY_ALLOCATIONS
BenchmarkSplit-32                        1690        682496 ns/op    154186 B/op       240 allocs/op
BenchmarkMerge-32                         588       2078640 ns/op   1138959 B/op       135 allocs/op
BenchmarkSplitWithEncryption-32          1379        887136 ns/op   1269816 B/op       294 allocs/op

Performance Highlights:

  • ⚡ Split: ~682 µs/op (~1,466 operations/sec) - Fast chunking with concurrent checksumming
  • ⚡ Merge: ~2.08 ms/op (~481 operations/sec) - Includes full integrity verification
  • 🔐 Encrypted Split: ~887 µs/op (~1,127 operations/sec) - AES-256 encryption overhead ~30%
  • 💾 Memory Efficient: 154 KB for split, 1.14 MB for merge operations
  • 🧵 Thread-Safe: All operations safe for concurrent use
  • 📊 Consistent: Results averaged over 20 benchmark runs for reliability

Note: Benchmarks measure individual operation cycles, not total file processing time. Actual throughput depends on file size, chunk size, storage speed, and system resources.
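
The throughput and overhead figures above follow directly from the ns/op column. A small sketch of the arithmetic, with hypothetical helper names:

```go
package main

import "fmt"

// opsPerSec converts a Go benchmark ns/op figure into operations per second.
func opsPerSec(nsPerOp float64) float64 {
    return 1e9 / nsPerOp
}

// overheadPct returns how much slower b is than a, in percent.
func overheadPct(a, b float64) float64 {
    return (b/a - 1) * 100
}

func main() {
    // ns/op values taken from the benchmark table above.
    const split, merge, encrypted = 682496.0, 2078640.0, 887136.0

    fmt.Printf("split:     %.0f ops/sec\n", opsPerSec(split))
    fmt.Printf("merge:     %.0f ops/sec\n", opsPerSec(merge))
    fmt.Printf("encrypted: %.0f ops/sec\n", opsPerSec(encrypted))
    fmt.Printf("encryption overhead: %.0f%%\n", overheadPct(split, encrypted))
}
```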

Optimizations
  • Concurrent Operations - Checksumming and chunking run in parallel
  • Buffer Reuse - Single buffer reused across all chunks
  • Progress Throttling - Progress updates throttled for high-chunk-count operations
  • Efficient I/O - Direct file I/O without intermediate buffers
  • Atomic Writes - Temporary files + rename for consistency
Performance Tips
  1. Choose Appropriate Chunk Sizes

    // For network transfer - smaller chunks (10-50 MB)
    networkConfigs := splitter.SplitConfigs{ChunkSize: 10, Format: "MB"}

    // For local storage - larger chunks (100-500 MB)
    localConfigs := splitter.SplitConfigs{ChunkSize: 100, Format: "MB"}
    
  2. Disable Progress for Maximum Speed

    s.Verbose = false  // s is the *Split from NewSplit; no progress reporting overhead
    
  3. Use Context Timeouts for Large Files

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
    defer cancel()
    

🧪 Testing

The splitter package includes comprehensive testing with 82%+ test coverage.

Basic Testing
# Run all tests
go test -v

# Run tests with coverage
go test -cover

# Generate detailed coverage report
go test -coverprofile=coverage.out
go tool cover -html=coverage.out

# Run specific test
go test -v -run TestSplitAndMerge

# Run benchmarks
go test -bench=. -benchmem
Test Categories
  • ✅ Unit tests for all public APIs
  • ✅ Integration tests for split/merge workflows
  • ✅ Concurrency tests for thread safety
  • ✅ Fuzz tests for input validation
  • ✅ Benchmark tests for performance
  • ✅ Error injection tests for resilience
  • ✅ Encryption/decryption roundtrip tests
  • ✅ Path security validation tests
Running Specific Test Categories
# Run encryption tests
go test -v -run TestSplitAndMergeWithEncryption

# Run cancellation tests
go test -v -run TestCancellation

# Run validation tests
go test -v -run TestValidate

# Run fuzz tests (requires Go 1.18+)
go test -fuzz=FuzzValidateAndPreparePath -fuzztime=30s
Continuous Integration
# Complete CI test suite
go test -cover -v -timeout 30s

# With JSON output for CI tools
go test -cover -json > test-results.json
Expected Test Results

When all tests pass, you should see:

PASS
coverage: 82%+ of statements
ok      github.com/AlexanderEl/splitter    15.234s

📋 Examples

Example 1: Basic File Split and Merge
package main

import (
    "context"
    "log"
    
    "github.com/AlexanderEl/splitter"
)

func main() {
    // Create splitter
    s, err := splitter.NewSplit(
        "/home/user/downloads",
        "large-video.mp4",
        true, // verbose
        &splitter.EncryptionConfig{
            IsEncrypted: false,
        },
    )
    if err != nil {
        log.Fatal(err)
    }
    
    // Configure split
    configs := splitter.SplitConfigs{
        ChunkSize: 50,
        Format:    "MB",
    }
    
    // Split file
    ctx := context.Background()
    if err := s.Split(ctx, configs); err != nil {
        log.Fatal(err)
    }
    
    log.Println("File split successfully!")
    
    // Later: Merge back
    merger, err := splitter.NewSplit(
        "/home/user/downloads/file-data_large-video.mp4",
        "restored-video.mp4",
        true,
        &splitter.EncryptionConfig{IsEncrypted: false},
    )
    if err != nil {
        log.Fatal(err)
    }
    
    if err := merger.Merge(ctx); err != nil {
        log.Fatal(err)
    }
    
    log.Println("File merged and verified successfully!")
}
Example 2: Encrypted Split with Key Export
package main

import (
    "context"
    "log"
    "os"

    "github.com/AlexanderEl/splitter"
)

func encryptedSplit() {
    s, err := splitter.NewSplit(
        "/sensitive",
        "confidential.zip",
        true,
        &splitter.EncryptionConfig{
            IsEncrypted: true,
            WriteToFile: true,
            KeyFilePath: "/secure/backup-key.txt",
        },
    )
    if err != nil {
        log.Fatal(err)
    }

    configs := splitter.SplitConfigs{
        ChunkSize: 10,
        Format:    "MB",
    }

    ctx := context.Background()
    if err := s.Split(ctx, configs); err != nil {
        log.Fatal(err)
    }

    // Export key for backup
    key, err := s.ExportEncryptionKey()
    if err != nil {
        log.Fatal(err)
    }

    // Save to additional secure location (os.WriteFile replaces the deprecated ioutil.WriteFile)
    if err := os.WriteFile("/backup/key-backup.txt", key, 0600); err != nil {
        log.Fatal(err)
    }
    
    log.Println("File split and encrypted. Key backed up.")
}
Example 3: Context with Timeout
package main

import (
    "context"
    "errors"
    "log"
    "time"
    
    "github.com/AlexanderEl/splitter"
)

func splitWithTimeout() {
    s, err := splitter.NewSplit(
        "/data",
        "huge-file.bin",
        true,
        &splitter.EncryptionConfig{IsEncrypted: false},
    )
    if err != nil {
        log.Fatal(err)
    }
    
    configs := splitter.SplitConfigs{
        ChunkSize: 100,
        Format:    "MB",
    }
    
    // Set 5 minute timeout
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
    defer cancel()
    
    if err := s.Split(ctx, configs); err != nil {
        if errors.Is(err, context.DeadlineExceeded) {
            log.Println("Operation timed out after 5 minutes")
        } else {
            log.Fatal(err)
        }
    }
}
Example 4: Concurrent Splitting (Multiple Files)
package main

import (
    "context"
    "log"
    "sync"
    
    "github.com/AlexanderEl/splitter"
)

func splitMultipleFiles(files []string) {
    var wg sync.WaitGroup
    errChan := make(chan error, len(files))
    
    for _, file := range files {
        wg.Add(1)
        go func(f string) {
            defer wg.Done()
            
            s, err := splitter.NewSplit(
                "/data",
                f,
                false, // no verbose for concurrent ops
                &splitter.EncryptionConfig{IsEncrypted: false},
            )
            if err != nil {
                errChan <- err
                return
            }
            
            configs := splitter.SplitConfigs{
                ChunkSize: 50,
                Format:    "MB",
            }
            
            ctx := context.Background()
            if err := s.Split(ctx, configs); err != nil {
                errChan <- err
            }
        }(file)
    }
    
    wg.Wait()
    close(errChan)
    
    // Check for errors
    for err := range errChan {
        log.Printf("Error splitting file: %v", err)
    }
}
Example 5: Custom Progress Callback
package main

import (
    "context"
    "fmt"
    "log"
    
    "github.com/AlexanderEl/splitter"
)

func splitWithCustomProgress() {
    s, err := splitter.NewSplit(
        "/data",
        "file.bin",
        true,
        &splitter.EncryptionConfig{IsEncrypted: false},
    )
    if err != nil {
        log.Fatal(err)
    }
    
    // Custom progress function
    s.ProgressFunc = func(current, total int64) {
        percent := float64(current) / float64(total) * 100
        fmt.Printf("\r[%-50s] %.2f%% (%d/%d chunks)",
            progressBar(int(percent)),
            percent,
            current,
            total)
    }
    
    configs := splitter.SplitConfigs{
        ChunkSize: 10,
        Format:    "MB",
    }
    
    ctx := context.Background()
    if err := s.Split(ctx, configs); err != nil {
        log.Fatal(err)
    }
    
    fmt.Println() // New line after progress
}

func progressBar(percent int) string {
    filled := percent / 2 // Scale to 50 chars
    bar := ""
    for i := 0; i < 50; i++ {
        if i < filled {
            bar += "="
        } else {
            bar += " "
        }
    }
    return bar
}
Example 6: Merge with Verification
package main

import (
    "context"
    "crypto/sha256"
    "fmt"
    "log"
    "os"

    "github.com/AlexanderEl/splitter"
)

func mergeAndVerify(originalFile, mergedFile string) {
    // Merge chunks
    merger, err := splitter.NewSplit(
        "file-data_myfile",
        "restored.bin",
        true,
        &splitter.EncryptionConfig{IsEncrypted: false},
    )
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()
    if err := merger.Merge(ctx); err != nil {
        log.Fatal(err)
    }

    // Additional verification: Compare with original
    // (os.ReadFile loads the whole file into memory; fine for small files)
    original, err := os.ReadFile(originalFile)
    if err != nil {
        log.Fatal(err)
    }

    merged, err := os.ReadFile(mergedFile)
    if err != nil {
        log.Fatal(err)
    }

    originalHash := sha256.Sum256(original)
    mergedHash := sha256.Sum256(merged)

    if originalHash == mergedHash {
        fmt.Println("✓ Verification successful: Files are identical")
    } else {
        fmt.Println("✗ Verification failed: Files differ")
    }
}

⚠️ Security Considerations

When to Use This Library

✅ Splitting large files for storage or transfer
✅ Encrypting sensitive file chunks
✅ Creating portable file archives
✅ Distributing large files across multiple drives
✅ Uploading large files to cloud storage with size limits
✅ Backing up large databases in manageable chunks
✅ Multi-threaded/concurrent file processing

When NOT to Use This Library

❌ Real-time streaming (use streaming protocols)
❌ Performing the network transfer itself (use rsync, FTP, etc.)
❌ Creating database backups (use database-specific tools)
❌ Version control (use Git LFS or similar)
❌ As a primary encryption tool (use dedicated encryption software)

Important Security Notes
  • Key Management - Encryption keys are as sensitive as the data they protect. Never commit keys to version control
  • Key Storage - Store keys with 0600 permissions in secure locations
  • Checksum Verification - Always verify checksums after merge operations
  • Threat Model - This library protects data at rest, not data in motion or in memory
  • File Permissions - Ensure proper permissions on output directories
  • Cleanup - Failed operations automatically clean up, but verify manually if needed
  • Context Cancellation - Always use context timeouts for long-running operations

🚫 Limitations

  • Maximum file size: 100 GB (configurable via MaxFileSize)
  • Maximum chunks: 10,000 per file (configurable via MaxChunks)
  • Minimum chunk size: 1 KB (configurable via MinChunkSize)
  • Supported platforms: Linux, macOS, Windows (amd64, arm64)
  • File system: Requires sufficient disk space for chunks (1.5x original file size during operations)
  • Concurrent operations: File-level locking not implemented (avoid splitting same file twice)
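
To see how these limits interact, the sketch below computes the chunk count for a given file and chunk size and checks it against the default values listed above. It is an independent illustration, not the library's validation code, and it assumes binary units (1 KB = 1024 bytes), which this README does not confirm.

```go
package main

import "fmt"

// ASSUMPTION: binary units. The defaults mirror the limits listed above.
const (
	kb = int64(1024)
	mb = 1024 * kb
	gb = 1024 * mb

	maxFileSize  = 100 * gb
	maxChunks    = int64(10000)
	minChunkSize = 1 * kb
)

// chunkCount returns the number of chunks needed for fileSize bytes at
// chunkSize bytes per chunk, or an error if a default limit is exceeded.
func chunkCount(fileSize, chunkSize int64) (int64, error) {
	switch {
	case fileSize <= 0:
		return 0, fmt.Errorf("file size must be positive, got %d", fileSize)
	case fileSize > maxFileSize:
		return 0, fmt.Errorf("file size %d exceeds maximum %d", fileSize, maxFileSize)
	case chunkSize < minChunkSize:
		return 0, fmt.Errorf("chunk size %d is below minimum %d", chunkSize, minChunkSize)
	}

	n := (fileSize + chunkSize - 1) / chunkSize // ceiling division
	if n > maxChunks {
		return 0, fmt.Errorf("%d chunks exceeds maximum %d; use a larger chunk size", n, maxChunks)
	}
	return n, nil
}

func main() {
	// 25 GB in 10 MB chunks fits comfortably.
	n, err := chunkCount(25*gb, 10*mb)
	fmt.Println(n, err) // 2560 <nil>

	// 100 GB in 1 MB chunks would need 102400 chunks, over the limit.
	_, err = chunkCount(100*gb, 1*mb)
	fmt.Println(err)
}
```

Under these assumptions, a 100 GB file needs chunks of at least 11 MB (whole megabytes) to stay within 10,000 chunks.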

Dependencies

🀝 Contributing

Contributions are welcome! Please follow these guidelines:

Development Setup
  1. Fork the repository
  2. Clone your fork: git clone https://github.com/YOUR_USERNAME/splitter.git
  3. Create a feature branch: git checkout -b feature/amazing-feature
  4. Make your changes and add tests
  5. Run tests: go test -v
  6. Ensure coverage stays above 80%: go test -cover
  7. Run benchmarks if performance-critical: go test -bench=. -benchmem
  8. Commit your changes: git commit -m 'Add amazing feature'
  9. Push to the branch: git push origin feature/amazing-feature
  10. Open a Pull Request

Code Standards
  • Follow Effective Go guidelines
  • Run gofmt and go vet before committing
  • Add tests for new functionality
  • Maintain test coverage above 80%
  • Document all exported functions and types
  • Update README for API changes
  • Provide examples for new features

Testing Requirements

All pull requests must:

  • ✅ Pass all existing tests
  • ✅ Maintain or improve test coverage (80%+ required)
  • ✅ Include tests for new functionality
  • ✅ Pass go vet and golint checks
  • ✅ Include benchmark results from 10+ runs for performance-critical changes
  • ✅ Document any performance change greater than 10% with justification

Pull Request Process
  1. Update README.md with details of changes
  2. Update CHANGELOG.md (if exists) with notable changes
  3. Ensure all tests pass and coverage is maintained
  4. Request review from maintainers
  5. Address any feedback or requested changes

🗺️ Roadmap

  • Support for compression (gzip, zstd, lz4)
  • Streaming support for large files to reduce memory usage
  • Resume capability for interrupted split/merge operations
  • Cloud storage integration (S3, GCS, Azure Blob)
  • Parallel chunk processing for improved performance
  • Web UI for file management
  • Docker containerization
  • Native support for archiving multiple files
  • Incremental backup support
  • Reed-Solomon error correction codes
  • Multi-volume archive support

📊 Version History

v1.0.0 (Current)
  • ✨ Initial release with file splitting and merging
  • 🔒 AES-256 encryption support
  • ✅ SHA-256 checksum verification
  • 🧵 Thread-safe operations
  • ⚡ Concurrent checksumming
  • 🛡️ Path traversal protection
  • 📊 Progress reporting
  • 🔄 Context support for cancellation
  • 🧪 Comprehensive test suite (82%+ coverage)
  • 📖 Complete API documentation
  • 🖥️ Professional CLI tool

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Uses Go's excellent crypto/aes and crypto/sha256 packages
  • Encryption powered by github.com/AlexanderEl/encryptor
  • Inspired by common file splitting utilities (split, 7zip, HJSplit)
  • Built with Go's excellent concurrency primitives

📞 Support

👨‍💻 Authors


Made with ❀️ for the Go community

If you find this library useful, please consider giving it a ⭐ on GitHub!

Note: This is production-grade software, but always test thoroughly with your specific use cases before deploying to production environments.

Documentation

Types

type EncryptionConfig

type EncryptionConfig struct {
	IsEncrypted bool   // Flag for whether to use encryption
	WriteToFile bool   // Flag for whether to write passkey to file
	KeyFilePath string // Path to passkey file location
}

EncryptionConfig defines configuration for enabling encryption

type ProgressFunc

type ProgressFunc func(current, total int64)

ProgressFunc is called to report progress
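
A callback with this signature might look like the sketch below. The ProgressFunc type is redeclared locally so the example compiles on its own. Whether current and total are counted in bytes or in chunks is not stated here, so the sketch only relies on their ratio. percent and renderBar are illustrative helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// ProgressFunc mirrors the documented signature; it is redeclared here only
// to keep the sketch self-contained.
type ProgressFunc func(current, total int64)

// percent converts a (current, total) pair to a whole percentage in [0, 100].
func percent(current, total int64) int {
	if total <= 0 {
		return 0
	}
	p := current * 100 / total
	if p < 0 {
		p = 0
	}
	if p > 100 {
		p = 100
	}
	return int(p)
}

// renderBar draws a text bar of the given width for pct in [0, 100].
func renderBar(pct, width int) string {
	filled := pct * width / 100
	return strings.Repeat("=", filled) + strings.Repeat(" ", width-filled)
}

func main() {
	var report ProgressFunc = func(current, total int64) {
		pct := percent(current, total)
		// \r returns to the start of the line so the bar redraws in place.
		fmt.Printf("\r[%s] %3d%%", renderBar(pct, 50), pct)
	}

	// Simulated progress updates; a real operation would invoke the callback.
	const total = 40
	for done := int64(0); done <= total; done += 10 {
		report(done, total)
	}
	fmt.Println()
}
```

Such a function could then be assigned to the ProgressFunc field of a Split value before starting an operation.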

type Split

type Split struct {
	FileName       string
	FilePath       string
	OutputDir      string       // Output directory for where to split/merge files
	ProgressFunc   ProgressFunc // The progress bar display func
	Verbose        bool         // Flag for whether to add verbosity to output
	CleanupOnError bool         // Flag for cleaning up temporary files during split/merge failure
	MaxFileSize    int64        // Maximum file size to process
	MaxChunks      int64        // Maximum number of chunks
	// contains filtered or unexported fields
}

Split represents a file splitting operation

func NewSplit

func NewSplit(filePath, fileName string,
	verbose bool, config *EncryptionConfig) (*Split, error)

NewSplit creates a new Split instance with defaults

func (*Split) ExportEncryptionKey

func (s *Split) ExportEncryptionKey() ([]byte, error)

ExportEncryptionKey exports the encryption key for later use

func (*Split) Merge

func (s *Split) Merge(ctx context.Context) error

Merge combines split files back into a single file
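
Because Merge accepts a context, a caller can bound its running time. The sketch below shows the general pattern with the standard library only. slowOperation is a hypothetical stand-in for a long-running call such as Merge, and how the library itself reports cancellation (for example, whether it returns the context error directly or wrapped) is an assumption to verify.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOperation is a hypothetical stand-in for a long-running, context-aware
// call. It finishes after work elapses or stops early when ctx is done.
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout runs op under a deadline and always releases the timer.
func runWithTimeout(timeout time.Duration, op func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return op(ctx)
}

// demo runs one operation that exceeds its deadline and one that does not.
func demo() (slowErr, fastErr error) {
	slowErr = runWithTimeout(50*time.Millisecond, func(ctx context.Context) error {
		return slowOperation(ctx, 2*time.Second)
	})
	fastErr = runWithTimeout(2*time.Second, func(ctx context.Context) error {
		return slowOperation(ctx, 10*time.Millisecond)
	})
	return slowErr, fastErr
}

func main() {
	slowErr, fastErr := demo()
	fmt.Println(errors.Is(slowErr, context.DeadlineExceeded), fastErr) // true <nil>
}
```

With the real library, the function passed to runWithTimeout would call Merge on a Split value with the supplied context.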

func (*Split) SetEncryption

func (s *Split) SetEncryption(filePath string) error

SetEncryption initializes encryption from a key file

func (*Split) Split

func (s *Split) Split(ctx context.Context, configs SplitConfigs) error

Split splits the file into chunks

type SplitConfigs

type SplitConfigs struct {
	ChunkSize uint   // Size of the file chunk we want to split into
	Format    string // Format of each chunk (K, KB, MB, GB)
}

SplitConfigs defines configuration for splitting operations

func (*SplitConfigs) Validate

func (sc *SplitConfigs) Validate() error

Validate checks if SplitConfigs are valid
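
The rules Validate applies are not spelled out here. As a rough illustration of what a caller-side check could look like, the sketch below converts a ChunkSize and Format pair into a byte count. It is not the library's implementation: chunkBytes is a hypothetical helper, and it assumes binary multiples and that K and KB mean the same thing.

```go
package main

import (
	"fmt"
	"math"
)

// SplitConfigs mirrors the documented struct; it is redeclared here only to
// keep the sketch self-contained.
type SplitConfigs struct {
	ChunkSize uint
	Format    string
}

// chunkBytes converts the configured size to bytes.
// ASSUMPTIONS: binary multiples (1 KB = 1024 bytes) and K equal to KB.
func chunkBytes(sc SplitConfigs) (uint64, error) {
	if sc.ChunkSize == 0 {
		return 0, fmt.Errorf("chunk size must be greater than zero")
	}

	var unit uint64
	switch sc.Format {
	case "K", "KB":
		unit = 1 << 10
	case "MB":
		unit = 1 << 20
	case "GB":
		unit = 1 << 30
	default:
		return 0, fmt.Errorf("unsupported format %q (want K, KB, MB, or GB)", sc.Format)
	}

	// Guard against overflow before multiplying.
	if uint64(sc.ChunkSize) > math.MaxUint64/unit {
		return 0, fmt.Errorf("chunk size %d %s overflows uint64", sc.ChunkSize, sc.Format)
	}
	return uint64(sc.ChunkSize) * unit, nil
}

func main() {
	n, err := chunkBytes(SplitConfigs{ChunkSize: 10, Format: "MB"})
	fmt.Println(n, err) // 10485760 <nil>

	_, err = chunkBytes(SplitConfigs{ChunkSize: 10, Format: "TB"})
	fmt.Println(err)
}
```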

type Splitter

type Splitter interface {
	// Split the input file based on given configs
	Split(ctx context.Context, configs SplitConfigs) error

	// Set the encryption service to be used in Splitter
	SetEncryption(filePath string) error

	// Export the encryption key used by Splitter
	ExportEncryptionKey() ([]byte, error)

	// Merge files based on configs
	Merge(ctx context.Context) error
}

Splitter defines operations for splitting and merging files
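
One practical use of this interface is substituting a test double in code that depends on splitting and merging. The sketch below redeclares SplitConfigs and Splitter exactly as documented above so that it compiles without the module. fakeSplitter, roundTrip, and demo are hypothetical names introduced for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// SplitConfigs and Splitter are copied from the documentation above so the
// sketch is self-contained; real code would import them from the package.
type SplitConfigs struct {
	ChunkSize uint
	Format    string
}

type Splitter interface {
	Split(ctx context.Context, configs SplitConfigs) error
	SetEncryption(filePath string) error
	ExportEncryptionKey() ([]byte, error)
	Merge(ctx context.Context) error
}

// fakeSplitter is an in-memory test double that records calls and touches
// no files.
type fakeSplitter struct {
	splitCalls int
	mergeCalls int
	key        []byte
}

func (f *fakeSplitter) Split(ctx context.Context, configs SplitConfigs) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	f.splitCalls++
	return nil
}

func (f *fakeSplitter) SetEncryption(filePath string) error {
	f.key = []byte("fake-key-from:" + filePath)
	return nil
}

func (f *fakeSplitter) ExportEncryptionKey() ([]byte, error) {
	if f.key == nil {
		return nil, errors.New("no encryption key set")
	}
	return f.key, nil
}

func (f *fakeSplitter) Merge(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	f.mergeCalls++
	return nil
}

// Compile-time check that the fake satisfies the interface.
var _ Splitter = (*fakeSplitter)(nil)

// roundTrip is example application code written against the interface.
func roundTrip(ctx context.Context, s Splitter, cfg SplitConfigs) error {
	if err := s.Split(ctx, cfg); err != nil {
		return fmt.Errorf("split: %w", err)
	}
	if err := s.Merge(ctx); err != nil {
		return fmt.Errorf("merge: %w", err)
	}
	return nil
}

// demo runs roundTrip against the fake and reports the recorded calls.
func demo() (splitCalls, mergeCalls int, err error) {
	fake := &fakeSplitter{}
	err = roundTrip(context.Background(), fake, SplitConfigs{ChunkSize: 10, Format: "MB"})
	return fake.splitCalls, fake.mergeCalls, err
}

func main() {
	s, m, err := demo()
	fmt.Println(s, m, err) // 1 1 <nil>
}
```

Since the documented methods of *Split match this interface, production code can pass the value returned by NewSplit wherever a Splitter is expected, while tests pass the fake.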

Directories

Path Synopsis
cmd
splitter command
