Caching & Performance

This document covers YAPFM's advanced caching and performance features, including intelligent caching, lazy loading, and streaming capabilities.

🚀 Overview

YAPFM provides caching and performance features that speed up repeated reads, reduce memory use, and make large files easier to work with:

Intelligent Caching System: Smart caching with TTL, LRU eviction, and comprehensive statistics
Lazy Loading: Memory-efficient loading of large sections
Streaming Support: Process files larger than available RAM
Unified Architecture: Centralized cache management and key generation

🧠 Intelligent Caching (CacheMixin)

Features

The new caching system provides:

Automatic Caching: Values are automatically cached on first access
TTL Support: Time-to-live for cached entries
LRU Eviction: Least Recently Used eviction when cache is full
Memory Management: Size-based eviction to prevent memory issues
Statistics Tracking: Hit/miss ratios and performance metrics
Pattern Invalidation: Invalidate cache entries using wildcard patterns
Thread Safety: Safe for use in multi-threaded environments
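
How TTL expiry and LRU eviction interact can be sketched in a few lines. The class below is an illustration only, not YAPFM's CacheMixin: the name TTLLRUCache, its parameters, and the injectable clock are assumptions made for the example, and it leaves out locking, size-based eviction, and pattern invalidation.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Toy cache combining TTL expiry with LRU eviction (not YAPFM's code)."""

    def __init__(self, max_size=1000, ttl=3600.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock            # injectable so expiry can be tested
        self._data = OrderedDict()    # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self.clock():
            # Missing or expired: drop any stale entry and count a miss.
            self._data.pop(key, None)
            self.misses += 1
            return default
        self._data.move_to_end(key)   # mark as most recently used
        self.hits += 1
        return entry[0]

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = TTLLRUCache(max_size=2, ttl=60)
cache.set("database.host", "localhost")
cache.set("database.port", 5432)
cache.get("database.host")          # touch: host becomes most recently used
cache.set("api.key", "secret")      # cache is full, so database.port is evicted
print(cache.get("database.port"))   # None
print(cache.get("database.host"))   # localhost
```

The ordered dictionary keeps entries in recency order, so eviction only has to pop from the front, while expiry is checked lazily on each read.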

Basic Usage

from yapfm import YAPFileManager

# Enable caching
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=1000,      # Maximum number of cached entries
    cache_ttl=3600        # TTL in seconds (1 hour)
)

# First access loads from file and caches
host = fm.get_value("database.host")
print(f"Database host: {host}")

# Subsequent accesses return from cache (much faster)
host_cached = fm.get_value("database.host")  # Returns from cache

Advanced Caching

# Get cache statistics
stats = fm.get_cache_stats()
print(f"Cache hits: {stats['unified_cache']['hits']}")
print(f"Cache misses: {stats['unified_cache']['misses']}")
print(f"Hit rate: {stats['unified_cache']['hit_rate']:.2%}")

# Invalidate specific patterns
fm.invalidate_cache("key:database.*")  # Invalidate all database keys

# Clear all cache
fm.clear_cache()
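
Wildcard invalidation can be pictured as glob matching over the cache keys. The sketch below applies the standard-library fnmatch module to a plain dict. The helper name invalidate_pattern and the "key:" prefix format (borrowed from the example above) are assumptions; YAPFM's internal matching may differ.

```python
from fnmatch import fnmatchcase


def invalidate_pattern(cache: dict, pattern: str) -> int:
    """Delete every entry whose key matches the glob pattern; return the count."""
    doomed = [key for key in cache if fnmatchcase(key, pattern)]
    for key in doomed:
        del cache[key]
    return len(doomed)


cache = {
    "key:database.host": "localhost",
    "key:database.port": 5432,
    "key:api.key": "secret",
}
removed = invalidate_pattern(cache, "key:database.*")
print(removed, sorted(cache))  # 2 ['key:api.key']
```

Collecting the matching keys before deleting avoids mutating the dictionary while iterating over it.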

Performance Benefits

Speed: Cached values are returned from memory without re-reading the file
Memory Efficient: LRU eviction prevents memory bloat
Configurable: Adjust cache size and TTL based on your needs
Statistics: Monitor cache performance and optimize usage

🎯 Lazy Loading (LazySectionsMixin)

Features

Lazy loading provides:

Memory Efficiency: Sections are loaded only when accessed
Performance: Avoid loading large sections unnecessarily
Cache Integration: Works seamlessly with the unified cache system
Automatic Invalidation: Cache invalidation when sections are modified
Statistics: Monitor lazy loading performance
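
The core idea of lazy loading, deferring work until the first read and repeating it only after invalidation, can be sketched with a small wrapper. LazySection and load_database are hypothetical names for this illustration and are not part of YAPFM's API.

```python
class LazySection:
    """Defers a loader call until the section is first read (illustrative only)."""

    def __init__(self, loader):
        self._loader = loader
        self._value = None
        self._loaded = False

    @property
    def loaded(self):
        return self._loaded

    def get(self):
        if not self._loaded:
            self._value = self._loader()   # the expensive step runs at most once
            self._loaded = True
        return self._value

    def invalidate(self):
        # Forget the loaded value so the next get() calls the loader again.
        self._value = None
        self._loaded = False


load_calls = []


def load_database():
    load_calls.append("database")
    return {"host": "localhost", "port": 5432}


section = LazySection(load_database)
print(section.loaded)          # False
print(section.get()["host"])   # localhost
section.get()
print(len(load_calls))         # 1
section.invalidate()
section.get()
print(len(load_calls))         # 2
```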

Basic Usage

from yapfm import YAPFileManager

# Enable lazy loading
fm = YAPFileManager(
    "large_config.json",
    enable_lazy_loading=True
)

# Section is not loaded until accessed
db_section = fm.get_section("database")  # Loads only when accessed
print(f"Database host: {db_section['host']}")

# Subsequent accesses return from lazy cache
db_section_again = fm.get_section("database")  # Returns from cache

Advanced Lazy Loading

# Force immediate loading (bypass lazy loading)
db_section = fm.get_section("database", lazy=False)

# Update section with cache invalidation
fm.set_section({
    "host": "newhost",
    "port": 3306
}, dot_key="database", update_lazy_cache=True)

# Get lazy loading statistics
stats = fm.get_lazy_stats()
print(f"Total sections: {stats['total_sections']}")
print(f"Loaded sections: {stats['loaded_sections']}")

# Clear lazy cache
fm.clear_lazy_cache()

Memory Benefits

Reduced Memory Usage: Only load sections when needed
Faster Startup: Don't load entire file at startup
Selective Loading: Load only the sections you actually use
Automatic Management: Cache invalidation when sections change

🌊 Streaming Support (StreamingMixin)

Features

Streaming provides:

Large File Support: Process files larger than available RAM
Chunked Reading: Process files in configurable chunks
Memory Efficient: Memory use is bounded by the chunk size rather than the file size
Multiple Encodings: Support for different file encodings
Progress Tracking: Monitor processing progress
Search Capabilities: Search within large files
Section Extraction: Extract specific sections from large files
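
Chunked reading is what keeps memory use tied to the chunk size. As a rough sketch of the technique, using only the standard library, a generator can yield one chunk at a time. The function name stream_chunks and the sample file are assumptions for the example, not YAPFM internals.

```python
import os
import tempfile


def stream_chunks(path, chunk_size=1024 * 1024, encoding="utf-8"):
    """Yield the file's text in pieces of at most chunk_size characters."""
    with open(path, "r", encoding=encoding) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:          # an empty string signals end of file
                break
            yield chunk


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.log")
    content = "".join(f"line {i}\n" for i in range(10))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)

    # Only one chunk is held by the generator at a time; collecting them
    # into a list here is just for the demonstration.
    chunks = list(stream_chunks(path, chunk_size=16))
    total_lines = sum(chunk.count("\n") for chunk in chunks)
    print(total_lines)  # 10
```

Note that chunk boundaries can fall in the middle of a line, so per-chunk processing must either tolerate split lines or, as here, only count characters.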

Basic Usage

from yapfm import YAPFileManager

# Enable streaming
fm = YAPFileManager(
    "large_file.txt",
    enable_streaming=True
)

# Stream file in chunks
for chunk in fm.stream_file(chunk_size=1024*1024):  # 1MB chunks
    process_chunk(chunk)  # process_chunk is a placeholder for your own handler

# Stream line by line
for line in fm.stream_lines():
    if "error" in line.lower():
        print(f"Error found: {line}")

Advanced Streaming

# Stream sections with markers
for section in fm.stream_sections("[", "]"):
    print(f"Section: {section['name']}")
    print(f"Content: {section['content']}")

# Process with custom function
def count_lines(chunk):
    return chunk.count('\n')

def progress_callback(progress):
    print(f"Progress: {progress:.1%}")

results = list(fm.process_large_file(count_lines, progress_callback))
total_lines = sum(results)

# Search in large file
for match in fm.search_in_file("error", case_sensitive=False):
    print(f"Found: {match['match']}")
    print(f"Context: {match['context']}")

# Get file information
size = fm.get_file_size()
progress = fm.get_file_progress()
estimated_time = fm.estimate_processing_time(count_lines)
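
Searching a large file without loading it can be done line by line while keeping only a small window of preceding lines as context. The sketch below is one way to do that with the standard library. The function name search_lines and the exact shape of the result dictionaries (the keys mirror those used above, but the content of 'context' is a guess) are assumptions.

```python
import io
import re
from collections import deque


def search_lines(lines, pattern, case_sensitive=True, context_lines=1):
    """Yield a dict for each line containing pattern, with preceding context."""
    flags = 0 if case_sensitive else re.IGNORECASE
    regex = re.compile(re.escape(pattern), flags)   # treat pattern as literal text
    before = deque(maxlen=context_lines)            # rolling window of earlier lines
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        found = regex.search(line)
        if found:
            yield {
                "line_number": line_number,
                "match": found.group(0),
                "context": list(before) + [line],
            }
        before.append(line)


log = io.StringIO("start\nERROR disk full\nok\nerror again\n")
matches = list(search_lines(log, "error", case_sensitive=False))
for match in matches:
    print(match["line_number"], match["match"], match["context"])
```

Any iterable of lines works as input, including an open file object, so memory use depends on the context window rather than on the file size.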

Performance Benefits

Memory Efficient: Process files of any size
Configurable: Adjust chunk size based on available memory
Progress Tracking: Monitor long-running operations
Search Capabilities: Find patterns in large files efficiently

🏗️ Unified Architecture

Centralized Cache Management

The new architecture provides:

Unified Cache: Single cache instance for all operations
Key Generation: Centralized key generation with caching
Statistics: Comprehensive statistics across all caching mechanisms
Memory Management: Centralized memory management and cleanup

Key Generation Optimization

# Key generation is now cached for performance
key1 = fm._generate_cache_key("database.host", None, None, "key")
key2 = fm._generate_cache_key("database.host", None, None, "key")  # Returns cached key

# Clear key generation cache
fm.clear_key_cache()
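
Caching generated keys amounts to memoizing a pure function. A minimal sketch using functools.lru_cache follows; the function name make_cache_key, its two parameters, and the "kind:dot_key" format are assumptions for illustration and do not describe the real signature of _generate_cache_key.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def make_cache_key(dot_key: str, kind: str = "key") -> str:
    """Build a namespaced cache key; repeated calls are served from the memo."""
    return f"{kind}:{dot_key}"


key1 = make_cache_key("database.host")
key2 = make_cache_key("database.host")   # answered from the memo, not recomputed
info = make_cache_key.cache_info()
print(key1, info.hits, info.misses)      # key:database.host 1 1

make_cache_key.cache_clear()             # analogous to clearing a key cache
```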

Comprehensive Statistics

# Get unified statistics
stats = fm.get_cache_stats()
print("Unified Cache Stats:")
print(f"  Hits: {stats['unified_cache']['hits']}")
print(f"  Misses: {stats['unified_cache']['misses']}")
print(f"  Hit Rate: {stats['unified_cache']['hit_rate']:.2%}")

print("Lazy Sections Stats:")
print(f"  Total Sections: {stats['lazy_sections']['total_sections']}")
print(f"  Loaded Sections: {stats['lazy_sections']['loaded_sections']}")

print("Key Cache Stats:")
print(f"  Size: {stats['key_cache']['size']}")

🔧 Configuration Examples

High-Performance Configuration

# For high-performance applications
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=10000,     # Large cache
    cache_ttl=7200,       # 2 hours TTL
    enable_lazy_loading=True,
    enable_streaming=True
)

Memory-Conscious Configuration

# For memory-constrained environments
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=100,       # Small cache
    cache_ttl=300,        # 5 minutes TTL
    enable_lazy_loading=True,
    enable_streaming=True
)

Development Configuration

# For development with frequent changes
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=1000,
    cache_ttl=60,         # Short TTL for development
    enable_lazy_loading=False,  # Disable for easier debugging
    enable_streaming=True
)

📊 Performance Monitoring

Cache Performance

# Monitor cache performance
stats = fm.get_cache_stats()
hit_rate = stats['unified_cache']['hit_rate']

if hit_rate < 0.8:  # Less than 80% hit rate
    print("Warning: Low cache hit rate, consider increasing cache size")
    fm.clear_cache()  # Optional: discards all entries, so the next reads go back to the file
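
A hit rate measured over only a handful of lookups is noisy, and a freshly started cache always begins with misses. One possible refinement, sketched here with hypothetical helper names and an arbitrary minimum of 100 lookups, is to warn only once enough lookups have accumulated:

```python
def hit_rate(hits: int, misses: int) -> float:
    """Return hits / (hits + misses), or 0.0 when nothing has been looked up yet."""
    total = hits + misses
    return hits / total if total else 0.0


def should_warn(hits: int, misses: int, threshold: float = 0.8,
                min_lookups: int = 100) -> bool:
    """Warn about a low hit rate only after a meaningful number of lookups."""
    return hits + misses >= min_lookups and hit_rate(hits, misses) < threshold


print(should_warn(1, 4))     # False: too few lookups to judge
print(should_warn(70, 30))   # True: 70% over 100 lookups
print(should_warn(90, 10))   # False: 90% is above the threshold
```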

Memory Usage

# Monitor memory usage
stats = fm.get_cache_stats()
memory_usage = stats['unified_cache']['memory_usage_mb']

if memory_usage > 50:  # More than 50MB
    print("Warning: High memory usage, consider reducing cache size")
    fm.clear_cache()

Lazy Loading Efficiency

# Monitor lazy loading efficiency
stats = fm.get_lazy_stats()
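total_sections = stats['total_sections']
# Guard against dividing by zero when no sections are registered yet
loaded_ratio = stats['loaded_sections'] / total_sections if total_sections else 0.0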

if loaded_ratio > 0.5:  # More than 50% of sections loaded
    print("Warning: High lazy loading ratio, consider disabling lazy loading")

🚨 Best Practices

Caching Best Practices

Choose Appropriate Cache Size: Balance memory usage with performance
Set Reasonable TTL: Don't cache data that changes frequently
Monitor Hit Rates: Aim for 80%+ hit rate
Use Pattern Invalidation: Invalidate related cache entries together
Clear Cache When Needed: Clear cache when data changes significantly

Lazy Loading Best Practices

Use for Large Sections: Lazy loading pays off mainly for large sections; small ones are cheap to load up front
Monitor Memory Usage: Keep track of loaded sections
Invalidate When Needed: Update lazy cache when sections change
Consider Access Patterns: Disable lazy loading if most sections end up being loaded anyway

Streaming Best Practices

Choose Appropriate Chunk Size: Balance memory usage with I/O efficiency
Use Progress Callbacks: Monitor long-running operations
Handle Errors Gracefully: Streaming can fail partway through a large file (for example on I/O or decoding errors), so catch exceptions around the loop
Consider File Size: Use streaming for files larger than available RAM
Test with Different Chunk Sizes: Find the optimal chunk size for your use case
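
Comparing chunk sizes can be as simple as timing a full read at each size. The sketch below does this with the standard library on a generated sample file; the helper name time_chunk_size and the three candidate sizes are assumptions, and real measurements should use your own files and be repeated, since the operating system's file cache affects the numbers.

```python
import os
import tempfile
import time


def time_chunk_size(path, chunk_size):
    """Read the whole file in chunk_size-byte pieces; return (bytes_read, seconds)."""
    start = time.perf_counter()
    bytes_read = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            bytes_read += len(chunk)
    return bytes_read, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as handle:
        handle.write(b"x" * (2 * 1024 * 1024))   # 2 MiB of sample data

    results = {}
    for chunk_size in (4 * 1024, 64 * 1024, 1024 * 1024):
        results[chunk_size] = time_chunk_size(path, chunk_size)
        bytes_read, seconds = results[chunk_size]
        print(f"{chunk_size:>8} bytes/chunk: {bytes_read} bytes in {seconds:.4f}s")
```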

🔄 Migration Guide

From Basic to Cached

# Old way
host = fm.get_key("database.host")

# New way with caching
host = fm.get_value("database.host")  # Automatically cached

From Immediate to Lazy Loading

# Old way
section = fm.get_section("database")

# New way with lazy loading
section = fm.get_section("database", lazy=True)  # Lazy loaded

Adding Streaming Support

# For large files
fm = YAPFileManager("large_file.txt", enable_streaming=True)

# Process in chunks
for chunk in fm.stream_file():
    process_chunk(chunk)  # process_chunk is a placeholder for your own handler

🎯 Use Cases

Configuration Management

# High-performance configuration management
fm = YAPFileManager(
    "app_config.json",
    enable_cache=True,
    cache_size=1000,
    cache_ttl=3600,
    enable_lazy_loading=True
)

# Fast access to frequently used values
db_host = fm.get_value("database.host")
api_key = fm.get_value("api.key")

Large File Processing

# Process large log files
fm = YAPFileManager("access.log", enable_streaming=True)

# Search for errors
for match in fm.search_in_file("ERROR", case_sensitive=False):
    print(f"Error at line {match['line_number']}: {match['match']}")

Memory-Efficient Data Access

# Process large configuration files
fm = YAPFileManager(
    "large_config.json",
    enable_lazy_loading=True,
    enable_cache=True
)

# Only load sections when needed
# (user_needs_database_config and process_database_config are placeholders for your own code)
if user_needs_database_config:
    db_config = fm.get_section("database")
    process_database_config(db_config)

🔮 Future Enhancements

Planned enhancements include:

Distributed Caching: Support for Redis and other distributed caches
Compression: Automatic compression of cached data
Encryption: Encrypted caching for sensitive data
Metrics: More detailed performance metrics
Profiling: Built-in profiling tools
Visualization: Cache performance visualization tools

For more information about these features, see the API Reference and Examples.