README
File Splitter
A robust, production-ready Go library and CLI tool for splitting large files into encrypted chunks and seamlessly merging them back together with automatic integrity verification.
Overview
File Splitter is a high-performance Go library designed to handle the chunking and reconstruction of files, with features including:
- Intelligent Chunking: Split files into configurable chunk sizes (bytes to gigabytes)
- AES-256 Encryption: AES-256-GCM encryption with automatic or manual key management
- Data Integrity: Automatic SHA-256 checksum verification on merge operations
- Thread-Safe Operations: Concurrent processing with proper synchronization
- Atomic File Writes: Ensures data consistency even during failures
- Context-Aware: Full support for cancellation and timeouts
- Path Traversal Protection: Robust security against malicious file paths
Features
- Intelligent Chunking - Split files into configurable sizes (bytes to gigabytes)
- AES-256 Encryption - Optional AES-256-GCM encryption for each chunk
- SHA-256 Integrity - Automatic checksum verification on merge
- Thread-Safe Operations - Safe for concurrent use with proper synchronization
- High Performance - Concurrent checksumming and chunking for maximum speed
- Security Hardened - Path traversal protection, null byte filtering, atomic file writes
- Progress Reporting - Real-time progress updates for long operations
- Context Support - Cancellation and timeout support for all operations
- Atomic Operations - Ensures data consistency even during failures
- Flexible Key Management - Auto-generation or custom key file support
- Clean Output - Organized directory structure with sanitized filenames
- Cross-Platform - Works on Linux, macOS, and Windows
Quick Start
Command Line Tool
The easiest way to get started is using the CLI tool for splitting and merging files:
# Clone the repository
git clone https://github.com/AlexanderEl/splitter.git
cd splitter
# Build the CLI tool
go build -o splitter cmd/splitter/main.go
# Split a file into 10MB chunks (default)
./splitter -op split -file large-video.mp4
# Merge chunks back together
./splitter -op merge -file file-data_large-video.mp4
See the CLI Usage section for complete documentation.
Installation
As a Library
go get github.com/AlexanderEl/splitter
As a CLI Tool
# Clone and build
git clone https://github.com/AlexanderEl/splitter.git
cd splitter
go build -o splitter cmd/splitter/main.go
# Optionally, install to your PATH
go install github.com/AlexanderEl/splitter/cmd/splitter@latest
CLI Usage
The splitter ships with a command-line interface for splitting and merging files.
Building the CLI
The build steps are the same as in the Installation section:
go build -o splitter cmd/splitter/main.go
Command Syntax
splitter -op <operation> -file <path> [options]
Available Flags
| Flag | Description | Required | Default |
|---|---|---|---|
| `-op` | Operation: `split` or `merge` | Yes | - |
| `-file` | Path to file (split) or directory (merge) | Yes | - |
| `-size` | Size of each chunk | No | 10 |
| `-format` | Chunk format: B, KB, MB, or GB | No | MB |
| `-out` | Output directory for chunks | No | Current directory |
| `-e` | Enable encryption/decryption | No | false |
| `-key` | Path to encryption key file (merge) | No | - |
| `-save-key` | Save encryption key to file (split) | No | true |
| `-key-path` | Custom path for encryption key | No | encryption-key.txt |
| `-v` | Verbose output with progress | No | false |
| `-version` | Show version and exit | No | false |
Split Examples
Basic Split - Default 10MB chunks
splitter -op split -file ~/Downloads/large-video.mp4
Custom Chunk Size - 100KB chunks
splitter -op split -file ~/Downloads/document.pdf -size 100 -format KB
With Encryption - Generates encryption key automatically
splitter -op split -file ~/Downloads/sensitive.zip -e
Encrypted Split with Custom Key Location
splitter -op split -file ~/Downloads/data.tar.gz -e -key-path /secure/my-key.txt
Verbose Output with Progress
splitter -op split -file ~/Downloads/large-file.bin -size 50 -format MB -v
Custom Output Directory
splitter -op split -file ~/Downloads/archive.tar -out ~/chunks/
Merge Examples
Basic Merge - Reconstructs original file
splitter -op merge -file file-data_large-video.mp4
Merge with Verbose Output
splitter -op merge -file file-data_document.pdf -v
Encrypted Merge - Uses key from split operation
splitter -op merge -file file-data_sensitive.zip -e -key encryption-key.txt
Merge with Custom Key Location
splitter -op merge -file file-data_archive.tar -e -key /secure/my-key.txt -v
CLI Output Behavior
Default Output Paths:
- Split: Creates a directory `file-data_<filename>` containing the chunks: `document.pdf` → `file-data_document.pdf/data_00`, `data_01`, etc.
- Merge: Creates a file with the name extracted from the directory: `file-data_document.pdf` → `document.pdf`
Normal Mode (default):
$ ./splitter -op split -file test.txt -size 100 -format KB
✓ Split complete: test.txt → ./file-data_test.txt
Verbose Mode (-v flag):
$ ./splitter -op split -file test.txt -size 100 -format KB -v
=========================================================
          [ASCII art banner]

   Secure File Splitting Tool v1.0.0
=========================================================
Splitting file: test.txt
  Location: /home/user/documents
  Chunk size: 100 KB
Progress:
[========================================] 100%
✓ File split successfully!
  Output directory: ./file-data_test.txt
✓ Operation completed successfully in 234ms!
CLI Error Messages
The CLI provides clear error messages for common issues:
# Missing input file
$ ./splitter -op split -file nonexistent.txt
✗ Error: input file does not exist: nonexistent.txt
# Invalid chunk format
$ ./splitter -op split -file test.txt -format TB
✗ Error: invalid split configuration: invalid format 'TB', must be one of: B, KB, MB, GB
# Missing encryption key for merge
$ ./splitter -op merge -file file-data_encrypted.zip -e
✗ Error: encryption key file path required (use -key flag)
# Wrong encryption key
$ ./splitter -op merge -file file-data_document.pdf -e -key wrong-key.txt
✗ Error: merge operation failed: failed to decrypt chunk 'data_0': cipher: message authentication failed
# Checksum mismatch
$ ./splitter -op merge -file file-data_corrupted.bin
✗ Error: merge operation failed: checksum mismatch: merged file 'corrupted.bin' is corrupted
CLI Best Practices
1. Keep Your Key File Safe

   # After splitting with encryption, back up your key
   cp encryption-key.txt ~/secure-backup/
   # Set restrictive permissions
   chmod 600 encryption-key.txt

2. Batch Processing

   # Split all large files in a directory
   for file in *.mp4; do
     ./splitter -op split -file "$file" -size 50 -format MB
   done
   # Merge all split directories
   for dir in file-data_*; do
     ./splitter -op merge -file "$dir" -v
   done

3. Use Different Keys for Different Projects

   # Project A
   ./splitter -op split -file project-a-data.zip -e -key-path keys/project-a.txt
   # Project B
   ./splitter -op split -file project-b-data.zip -e -key-path keys/project-b.txt

4. Verify Split Integrity

   # Split a file
   ./splitter -op split -file important.bin -v
   # Merge and verify (merge verifies the SHA-256 checksum automatically)
   ./splitter -op merge -file file-data_important.bin -v
   # Compare checksums manually if desired
   sha256sum important.bin
Integration with Scripts
#!/bin/bash
# backup-and-split.sh
BACKUP_FILE="/data/large-database-backup.sql"
KEY_FILE="$HOME/.secrets/backup-key.txt"
# Split the backup into manageable chunks
./splitter -op split \
-file "$BACKUP_FILE" \
-size 100 \
-format MB \
-e \
-key-path "$KEY_FILE" \
-v
# Move chunks to backup location
mv file-data_large-database-backup.sql /backup/$(date +%Y%m%d)/
echo "Backup split and encrypted successfully!"
Programmatic Usage
Basic Split Example
package main
import (
"context"
"fmt"
"log"
"github.com/AlexanderEl/splitter"
)
func main() {
// Create splitter instance
s, err := splitter.NewSplit(
"/path/to/directory",
"large-file.bin",
true, // verbose
&splitter.EncryptionConfig{
IsEncrypted: false,
},
)
if err != nil {
log.Fatal(err)
}
// Configure split parameters
configs := splitter.SplitConfigs{
ChunkSize: 50, // 50 MB chunks
Format: "MB",
}
// Execute split operation
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
fmt.Println("File split successfully!")
}
Split with Encryption
s, err := splitter.NewSplit(
"/path/to/directory",
"sensitive-data.zip",
true,
&splitter.EncryptionConfig{
IsEncrypted: true,
WriteToFile: true,
KeyFilePath: "/secure/location/encryption.key",
},
)
if err != nil {
log.Fatal(err)
}
configs := splitter.SplitConfigs{
ChunkSize: 10,
Format: "MB",
}
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
// Export key for later use
key, err := s.ExportEncryptionKey()
if err != nil {
log.Fatal(err)
}
fmt.Printf("Encryption key: %x\n", key)
Merge Example
merger, err := splitter.NewSplit(
"/path/to/chunks/file-data_large-file.bin",
"reconstructed.bin",
true,
&splitter.EncryptionConfig{
IsEncrypted: false,
},
)
if err != nil {
log.Fatal(err)
}
ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
log.Fatal(err)
}
fmt.Println("File merged successfully with checksum verification!")
Merge with Encryption
merger, err := splitter.NewSplit(
"/path/to/chunks/file-data_sensitive-data.zip",
"decrypted.zip",
true,
&splitter.EncryptionConfig{
IsEncrypted: true,
},
)
if err != nil {
log.Fatal(err)
}
// Load encryption key
if err := merger.SetEncryption("/secure/location/encryption.key"); err != nil {
log.Fatal(err)
}
ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
log.Fatal(err)
}
Context Cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// s is a *Split created by NewSplit
if err := s.Split(ctx, configs); err != nil {
if errors.Is(err, context.DeadlineExceeded) {
log.Println("Operation timed out")
} else {
log.Fatal(err)
}
}
API Documentation
Core Types
Splitter Interface
Defines operations for splitting and merging files.
type Splitter interface {
Split(ctx context.Context, configs SplitConfigs) error
SetEncryption(filePath string) error
ExportEncryptionKey() ([]byte, error)
Merge(ctx context.Context) error
}
Split Struct
Main implementation of the Splitter interface.
type Split struct {
FileName string
FilePath string
OutputDir string
ProgressFunc ProgressFunc
Verbose bool
CleanupOnError bool
MaxFileSize int64
MaxChunks int64
}
Constructor
NewSplit(filePath, fileName string, verbose bool, config *EncryptionConfig) (*Split, error)
Creates a new Split instance with specified configuration.
Parameters:
- `filePath` - Directory containing the file (split) or directory with chunks (merge)
- `fileName` - Name of the file to split, or output filename for merge
- `verbose` - Enable progress reporting
- `config` - Encryption configuration
Returns: Split instance or error
Example:
s, err := splitter.NewSplit(
"/path/to/directory",
"large-file.bin",
true, // verbose
&splitter.EncryptionConfig{
IsEncrypted: false,
},
)
Configuration Types
SplitConfigs
Configuration for split operations.
type SplitConfigs struct {
ChunkSize uint // Size of each chunk
Format string // Format: "B", "KB", "MB", "GB"
}
Methods:
- `Validate() error` - Validates the configuration
Example:
configs := splitter.SplitConfigs{
ChunkSize: 50,
Format: "MB",
}
if err := configs.Validate(); err != nil {
log.Fatal(err)
}
EncryptionConfig
Configuration for encryption settings.
type EncryptionConfig struct {
IsEncrypted bool // Enable encryption
WriteToFile bool // Write key to file
KeyFilePath string // Path for key file
}
Example:
encConfig := &splitter.EncryptionConfig{
IsEncrypted: true,
WriteToFile: true,
KeyFilePath: "/secure/location/key.txt",
}
Core Methods
Split(ctx context.Context, configs SplitConfigs) error
Splits a file into chunks based on configuration.
Parameters:
- `ctx` - Context for cancellation/timeout
- `configs` - Split configuration (chunk size and format)
Returns: Error if operation fails
Features:
- Validates file path and configuration
- Creates output directory with sanitized name
- Splits file into chunks with optional encryption
- Generates SHA-256 checksum file
- Concurrent checksumming for performance
- Automatic cleanup on error (if enabled)
Example:
ctx := context.Background()
configs := splitter.SplitConfigs{
ChunkSize: 10,
Format: "MB",
}
// s is a *Split created by NewSplit
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
Merge(ctx context.Context) error
Merges split chunks back into original file with integrity verification.
Parameters:
- `ctx` - Context for cancellation/timeout
Returns: Error if operation fails
Features:
- Validates directory structure
- Reads and orders chunks sequentially
- Optional decryption of chunks
- SHA-256 checksum verification
- Atomic file creation (temp + rename)
- Automatic cleanup of temp files on error
Example:
merger, _ := splitter.NewSplit(
"/path/to/file-data_myfile",
"restored-file.bin",
true,
&splitter.EncryptionConfig{IsEncrypted: false},
)
ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
log.Fatal(err)
}
SetEncryption(filePath string) error
Loads encryption key from file for merge operations.
Parameters:
- `filePath` - Path to encryption key file
Returns: Error if key cannot be loaded
Example:
if err := merger.SetEncryption("/path/to/encryption-key.txt"); err != nil {
log.Fatal(err)
}
ExportEncryptionKey() ([]byte, error)
Exports the current encryption key for backup or sharing.
Returns: Encryption key bytes or error
Example:
key, err := s.ExportEncryptionKey() // s is a *Split created by NewSplit
if err != nil {
log.Fatal(err)
}
// Save to a secure location (owner read/write only)
os.WriteFile("/secure/backup-key.txt", key, 0600)
Helper Types
ProgressFunc
Function type for progress reporting callbacks.
type ProgressFunc func(current, total int64)
Example:
customProgress := func(current, total int64) {
percent := float64(current) / float64(total) * 100
fmt.Printf("\rProgress: %.2f%%", percent)
}
s.ProgressFunc = customProgress // s is a *Split created by NewSplit
Constants
const (
OutputDirPrefix = "file-data_"
ChecksumFileName = "checksum.sha256"
DefaultDirPermissions = 0755
DefaultFilePermissions = 0644
MinChunkSize = 1024 // 1 KB minimum
MaxFileSize = 107374182400 // 100 GB default
MaxChunks = 10000 // Maximum chunks per file
)
Error Types
var (
ErrInvalidPath = errors.New("invalid file path")
ErrDirectoryExists = errors.New("output directory already exists")
ErrFileTooLarge = errors.New("file exceeds maximum size")
ErrTooManyChunks = errors.New("would create too many chunks")
ErrEmptyDirectory = errors.New("directory is empty")
ErrChecksumMismatch = errors.New("checksum mismatch")
ErrEncryptionNotInitialized = errors.New("encryption service not initialized")
)
Error Handling
The library uses sentinel errors for clear error handling:
err := s.Split(ctx, configs) // s is a *Split created by NewSplit
if errors.Is(err, splitter.ErrDirectoryExists) {
// Handle existing directory
log.Println("Output directory already exists")
} else if errors.Is(err, splitter.ErrFileTooLarge) {
// Handle file size limit
log.Println("File is too large to split")
} else if errors.Is(err, splitter.ErrChecksumMismatch) {
// Handle corruption
log.Println("Merged file is corrupted")
}
Configuration Options
Split Configuration
| Parameter | Type | Description | Valid Values |
|---|---|---|---|
| `ChunkSize` | uint | Size of each chunk | Any positive integer |
| `Format` | string | Unit for chunk size | B, KB, MB, GB |
Encryption Configuration
| Parameter | Type | Description | Default |
|---|---|---|---|
| `IsEncrypted` | bool | Enable encryption | false |
| `WriteToFile` | bool | Save key to file | false |
| `KeyFilePath` | string | Path for key file | ./encryption-key.txt |
Splitter Options
| Option | Type | Description | Default |
|---|---|---|---|
| `Verbose` | bool | Enable progress reporting | false |
| `CleanupOnError` | bool | Remove partial files on failure | true |
| `MaxFileSize` | int64 | Maximum file size to process | 100 GB |
| `MaxChunks` | int64 | Maximum number of chunks | 10,000 |
| `OutputDir` | string | Directory for output files | Current directory |
| `ProgressFunc` | ProgressFunc | Custom progress callback | Default progress bar |
Security Features
Encryption
- AES-256-GCM - Industry-standard authenticated encryption with associated data (AEAD)
- Unique Keys - Each split operation generates a unique encryption key
- Automatic Key Management - Keys auto-generated or loaded from files
- Thread-Safe Encryption - All encryption operations are thread-safe
Data Integrity
- SHA-256 Checksums - Automatic checksum generation and verification
- Constant-Time Comparison - Prevents timing attacks on checksum validation
- Atomic File Operations - Temp file + rename ensures consistency
Path Security
- Path Traversal Protection - Validates and cleans all file paths
- Null Byte Filtering - Prevents null byte injection attacks
- Filename Sanitization - Removes dangerous characters from filenames
- Directory Validation - Ensures operations stay within intended directories
Operational Security
- Context Support - Proper cancellation prevents resource leaks
- Automatic Cleanup - Failed operations clean up partial files
- Secure File Permissions - Key files created with 0600 permissions
- Memory Safety - Thread-safe operations with proper locking
Best Practices
- Always verify checksums after merge operations
- Store encryption keys in secure locations with restricted permissions
- Never commit encryption keys to version control
- Use unique keys for different files or projects
- Enable verbose mode for critical operations
- Implement regular key backup procedures
- Test recovery procedures before production use
- Use context timeouts for long-running operations
Architecture
Split Process
- Validation: Validates file path, chunk size, and configuration
- Preparation: Creates output directory and calculates chunk count
- Concurrent Operations:
- Chunk Creation: Reads and writes file chunks with optional encryption
- Checksum Generation: Computes SHA-256 hash in parallel
- Atomic Writes: Uses temporary files and rename for atomicity
- Progress Reporting: Reports progress at configurable intervals
Merge Process
- Validation: Verifies directory structure and encryption setup
- Discovery: Identifies and orders data chunks
- Sequential Reconstruction: Reads chunks with optional decryption
- Checksum Verification: Validates reconstructed file integrity
- Atomic Completion: Renames temp file only after successful verification
Security Model
- Path Sanitization: All file paths are cleaned and validated
- Encryption: AES-256-GCM with unique keys per split operation
- Integrity: SHA-256 checksums protect against corruption
- Constant-Time Comparison: Prevents timing attacks on checksums
- Thread Safety: All encryption operations are thread-safe
Performance
Quick Reference:
| Operation | Speed | Throughput | Memory |
|---|---|---|---|
| Split | 682 µs/op | ~1.47K ops/sec | 154 KB/op |
| Merge | 2.08 ms/op | ~481 ops/sec | 1.14 MB/op |
| Split (Encrypted) | 887 µs/op | ~1.13K ops/sec | 1.27 MB/op |
Benchmarks on Intel Core i9-14900HX (32 threads) - averaged over 20 runs:
TEST_NAME ITERATIONS AVG_ITER_DURATION MEMORY_USED NUM_MEMORY_ALLOCATIONS
BenchmarkSplit-32 1690 682496 ns/op 154186 B/op 240 allocs/op
BenchmarkMerge-32 588 2078640 ns/op 1138959 B/op 135 allocs/op
BenchmarkSplitWithEncryption-32 1379 887136 ns/op 1269816 B/op 294 allocs/op
Performance Highlights:
- Split: ~682 µs/op (~1,466 operations/sec) - fast chunking with concurrent checksumming
- Merge: ~2.08 ms/op (~481 operations/sec) - includes full integrity verification
- Encrypted Split: ~887 µs/op (~1,127 operations/sec) - AES-256 encryption overhead ~30%
- Memory Efficient: 154 KB for split, 1.14 MB for merge operations
- Thread-Safe: All operations safe for concurrent use
- Consistent: Results averaged over 20 benchmark runs for reliability
Note: Benchmarks measure individual operation cycles, not total file processing time. Actual throughput depends on file size, chunk size, storage speed, and system resources.
Optimizations
- Concurrent Operations - Checksumming and chunking run in parallel
- Buffer Reuse - Single buffer reused across all chunks
- Progress Throttling - Progress updates throttled for high-chunk-count operations
- Efficient I/O - Direct file I/O without intermediate buffers
- Atomic Writes - Temporary files + rename for consistency
Performance Tips
1. Choose Appropriate Chunk Sizes

   // For network transfer - smaller chunks (10-50 MB)
   configs := SplitConfigs{ChunkSize: 10, Format: "MB"}
   // For local storage - larger chunks (100-500 MB)
   configs := SplitConfigs{ChunkSize: 100, Format: "MB"}

2. Disable Progress for Maximum Speed

   s.Verbose = false // no progress reporting overhead

3. Use Context Timeouts for Large Files

   ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
   defer cancel()
Testing
The splitter package includes comprehensive testing with 82%+ test coverage.
Basic Testing
# Run all tests
go test -v
# Run tests with coverage
go test -cover
# Generate detailed coverage report
go test -coverprofile=coverage.out
go tool cover -html=coverage.out
# Run specific test
go test -v -run TestSplitAndMerge
# Run benchmarks
go test -bench=. -benchmem
Test Categories
- Unit tests for all public APIs
- Integration tests for split/merge workflows
- Concurrency tests for thread safety
- Fuzz tests for input validation
- Benchmark tests for performance
- Error injection tests for resilience
- Encryption/decryption roundtrip tests
- Path security validation tests
Running Specific Test Categories
# Run encryption tests
go test -v -run TestSplitAndMergeWithEncryption
# Run cancellation tests
go test -v -run TestCancellation
# Run validation tests
go test -v -run TestValidate
# Run fuzz tests (requires Go 1.18+)
go test -fuzz=FuzzValidateAndPreparePath -fuzztime=30s
Continuous Integration
# Complete CI test suite
go test -cover -v -timeout 30s
# With JSON output for CI tools
go test -cover -json > test-results.json
Expected Test Results
When all tests pass, you should see:
PASS
coverage: 82%+ of statements
ok github.com/AlexanderEl/splitter 15.234s
Examples
Example 1: Basic File Split and Merge
package main
import (
"context"
"log"
"github.com/AlexanderEl/splitter"
)
func main() {
// Create splitter
s, err := splitter.NewSplit(
"/home/user/downloads",
"large-video.mp4",
true, // verbose
&splitter.EncryptionConfig{
IsEncrypted: false,
},
)
if err != nil {
log.Fatal(err)
}
// Configure split
configs := splitter.SplitConfigs{
ChunkSize: 50,
Format: "MB",
}
// Split file
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
log.Println("File split successfully!")
// Later: Merge back
merger, err := splitter.NewSplit(
"/home/user/downloads/file-data_large-video.mp4",
"restored-video.mp4",
true,
&splitter.EncryptionConfig{IsEncrypted: false},
)
if err != nil {
log.Fatal(err)
}
if err := merger.Merge(ctx); err != nil {
log.Fatal(err)
}
log.Println("File merged and verified successfully!")
}
Example 2: Encrypted Split with Key Export
package main
import (
"context"
"io/ioutil"
"log"
"github.com/AlexanderEl/splitter"
)
func encryptedSplit() {
s, err := splitter.NewSplit(
"/sensitive",
"confidential.zip",
true,
&splitter.EncryptionConfig{
IsEncrypted: true,
WriteToFile: true,
KeyFilePath: "/secure/backup-key.txt",
},
)
if err != nil {
log.Fatal(err)
}
configs := splitter.SplitConfigs{
ChunkSize: 10,
Format: "MB",
}
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
// Export key for backup
key, err := s.ExportEncryptionKey()
if err != nil {
log.Fatal(err)
}
// Save to additional secure location
if err := ioutil.WriteFile("/backup/key-backup.txt", key, 0600); err != nil {
log.Fatal(err)
}
log.Println("File split and encrypted. Key backed up.")
}
Example 3: Context with Timeout
package main
import (
"context"
"errors"
"log"
"time"
"github.com/AlexanderEl/splitter"
)
func splitWithTimeout() {
s, _ := splitter.NewSplit(
"/data",
"huge-file.bin",
true,
&splitter.EncryptionConfig{IsEncrypted: false},
)
configs := splitter.SplitConfigs{
ChunkSize: 100,
Format: "MB",
}
// Set 5 minute timeout
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()
if err := s.Split(ctx, configs); err != nil {
if errors.Is(err, context.DeadlineExceeded) {
log.Println("Operation timed out after 5 minutes")
} else {
log.Fatal(err)
}
}
}
Example 4: Concurrent Splitting (Multiple Files)
package main
import (
"context"
"log"
"sync"
"github.com/AlexanderEl/splitter"
)
func splitMultipleFiles(files []string) {
var wg sync.WaitGroup
errChan := make(chan error, len(files))
for _, file := range files {
wg.Add(1)
go func(f string) {
defer wg.Done()
s, err := splitter.NewSplit(
"/data",
f,
false, // no verbose for concurrent ops
&splitter.EncryptionConfig{IsEncrypted: false},
)
if err != nil {
errChan <- err
return
}
configs := splitter.SplitConfigs{
ChunkSize: 50,
Format: "MB",
}
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
errChan <- err
}
}(file)
}
wg.Wait()
close(errChan)
// Check for errors
for err := range errChan {
log.Printf("Error splitting file: %v", err)
}
}
Example 5: Custom Progress Callback
package main
import (
"context"
"fmt"
"log"
"github.com/AlexanderEl/splitter"
)
func splitWithCustomProgress() {
s, _ := splitter.NewSplit(
"/data",
"file.bin",
true,
&splitter.EncryptionConfig{IsEncrypted: false},
)
// Custom progress function
s.ProgressFunc = func(current, total int64) {
percent := float64(current) / float64(total) * 100
fmt.Printf("\r[%-50s] %.2f%% (%d/%d chunks)",
progressBar(int(percent)),
percent,
current,
total)
}
configs := splitter.SplitConfigs{
ChunkSize: 10,
Format: "MB",
}
ctx := context.Background()
if err := s.Split(ctx, configs); err != nil {
log.Fatal(err)
}
fmt.Println() // New line after progress
}
func progressBar(percent int) string {
filled := percent / 2 // Scale to 50 chars
bar := ""
for i := 0; i < 50; i++ {
if i < filled {
bar += "="
} else {
bar += " "
}
}
return bar
}
Example 6: Merge with Verification
package main
import (
"context"
"crypto/sha256"
"fmt"
"io/ioutil"
"log"
"github.com/AlexanderEl/splitter"
)
func mergeAndVerify(originalFile, mergedFile string) {
// Merge chunks
merger, _ := splitter.NewSplit(
"file-data_myfile",
mergedFile, // write the merged output to the path verified below
true,
&splitter.EncryptionConfig{IsEncrypted: false},
)
ctx := context.Background()
if err := merger.Merge(ctx); err != nil {
log.Fatal(err)
}
// Additional verification: Compare with original
original, err := ioutil.ReadFile(originalFile)
if err != nil {
log.Fatal(err)
}
merged, err := ioutil.ReadFile(mergedFile)
if err != nil {
log.Fatal(err)
}
originalHash := sha256.Sum256(original)
mergedHash := sha256.Sum256(merged)
if originalHash == mergedHash {
fmt.Println("β Verification successful: Files are identical")
} else {
fmt.Println("β Verification failed: Files differ")
}
}
Security Considerations
When to Use This Library
- Splitting large files for storage or transfer
- Encrypting sensitive file chunks
- Creating portable file archives
- Distributing large files across multiple drives
- Uploading large files to cloud storage with size limits
- Backing up large databases in manageable chunks
- Multi-threaded/concurrent file processing
When NOT to Use This Library
- Real-time streaming (use streaming protocols)
- Network file transfer (use rsync, FTP, etc.)
- Database backups (use database-specific tools)
- Version control (use Git LFS or similar)
- As a primary encryption tool (use dedicated encryption software)
Important Security Notes
- Key Management - Encryption keys are as sensitive as the data they protect. Never commit keys to version control
- Key Storage - Store keys with 0600 permissions in secure locations
- Checksum Verification - Always verify checksums after merge operations
- Threat Model - This library protects data at rest, not data in motion or in memory
- File Permissions - Ensure proper permissions on output directories
- Cleanup - Failed operations automatically clean up, but verify manually if needed
- Context Cancellation - Always use context timeouts for long-running operations
Limitations
- Maximum file size: 100 GB (configurable via `MaxFileSize`)
- Maximum chunks: 10,000 per file (configurable via `MaxChunks`)
- Minimum chunk size: 1 KB (configurable via `MinChunkSize`)
- Supported platforms: Linux, macOS, Windows (amd64, arm64)
- File system: Requires sufficient disk space for chunks (1.5x original file size during operations)
- Concurrent operations: File-level locking is not implemented (avoid splitting the same file twice concurrently)
Dependencies
- github.com/AlexanderEl/encryptor: Encryption service
Contributing
Contributions are welcome! Please follow these guidelines:
Development Setup
1. Fork the repository
2. Clone your fork: `git clone https://github.com/YOUR_USERNAME/splitter.git`
3. Create a feature branch: `git checkout -b feature/amazing-feature`
4. Make your changes and add tests
5. Run tests: `go test -v`
6. Ensure coverage stays above 80%: `go test -cover`
7. Run benchmarks if performance-critical: `go test -bench=. -benchmem`
8. Commit your changes: `git commit -m 'Add amazing feature'`
9. Push to the branch: `git push origin feature/amazing-feature`
10. Open a Pull Request
Code Standards
- Follow Effective Go guidelines
- Run `gofmt` and `go vet` before committing
- Add tests for new functionality
- Maintain test coverage above 80%
- Document all exported functions and types
- Update README for API changes
- Provide examples for new features
Testing Requirements
All pull requests must:
- Pass all existing tests
- Maintain or improve test coverage (80%+ required)
- Include tests for new functionality
- Pass `go vet` and `golint` checks
- Performance-critical changes must include benchmark results from 10+ runs
- Document any performance changes >10% with justification
Pull Request Process
- Update README.md with details of changes
- Update CHANGELOG.md (if exists) with notable changes
- Ensure all tests pass and coverage is maintained
- Request review from maintainers
- Address any feedback or requested changes
Roadmap
- Support for compression (gzip, zstd, lz4)
- Streaming support for large files to reduce memory usage
- Resume capability for interrupted split/merge operations
- Cloud storage integration (S3, GCS, Azure Blob)
- Parallel chunk processing for improved performance
- Web UI for file management
- Docker containerization
- Native support for archiving multiple files
- Incremental backup support
- Reed-Solomon error correction codes
- Multi-volume archive support
Version History
v1.0.0 (Current)
- Initial release with file splitting and merging
- AES-256 encryption support
- SHA-256 checksum verification
- Thread-safe operations
- Concurrent checksumming
- Path traversal protection
- Progress reporting
- Context support for cancellation
- Comprehensive test suite (82%+ coverage)
- Complete API documentation
- CLI tool
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Uses Go's excellent `crypto/aes` and `crypto/sha256` packages
- Encryption powered by github.com/AlexanderEl/encryptor
- Inspired by common file splitting utilities (split, 7zip, HJSplit)
- Built with Go's excellent concurrency primitives
Support
- Bug Reports: Open an issue
- Feature Requests: Open an issue
- Discussions: GitHub Discussions
- Documentation: Wiki
Authors
- Alexander El - Initial work - @AlexanderEl
Made with ❤️ for the Go community
If you find this library useful, please consider giving it a ⭐ on GitHub!
Note: This is production-grade software, but always test thoroughly with your specific use cases before deploying to production environments.
Documentation
Types
type EncryptionConfig
type EncryptionConfig struct {
IsEncrypted bool // Flag for whether to use encryption
WriteToFile bool // Flag for whether to write passkey to file
KeyFilePath string // Path to passkey file location
}
EncryptionConfig defines configuration for enabling encryption
type ProgressFunc
type ProgressFunc func(current, total int64)
ProgressFunc is called to report progress
type Split
type Split struct {
FileName string
FilePath string
OutputDir string // Output directory for where to split/merge files
ProgressFunc ProgressFunc // The progress bar display func
Verbose bool // Flag for whether to add verbosity to output
CleanupOnError bool // Flag for cleaning up temporary files during split/merge failure
MaxFileSize int64 // Maximum file size to process
MaxChunks int64 // Maximum number of chunks
// contains filtered or unexported fields
}
Split represents a file splitting operation
func NewSplit
func NewSplit(filePath, fileName string, verbose bool, config *EncryptionConfig) (*Split, error)
NewSplit creates a new Split instance with defaults
func (*Split) ExportEncryptionKey
ExportEncryptionKey exports the encryption key for later use
func (*Split) SetEncryption
SetEncryption initializes encryption from a key file
type SplitConfigs
type SplitConfigs struct {
ChunkSize uint // Size of the file chunk we want to split into
Format string // Format of each chunk (B, KB, MB, GB)
}
SplitConfigs defines configuration for splitting operations
func (*SplitConfigs) Validate
func (sc *SplitConfigs) Validate() error
Validate checks if SplitConfigs are valid
type Splitter
type Splitter interface {
// Split the input file based on given configs
Split(ctx context.Context, configs SplitConfigs) error
// Set the encryption service to be used in Splitter
SetEncryption(filePath string) error
// Export the encryption key used by Splitter
ExportEncryptionKey() ([]byte, error)
// Merge files based on configs
Merge(ctx context.Context) error
}
Splitter defines operations for splitting and merging files