Caching & Performance¶
This document covers YAPFM's advanced caching and performance features, including intelligent caching, lazy loading, and streaming capabilities.
🚀 Overview¶
YAPFM provides caching and performance features aimed at faster repeated access and lower memory usage:
- Intelligent Caching System: Smart caching with TTL, LRU eviction, and comprehensive statistics
- Lazy Loading: Memory-efficient loading of large sections
- Streaming Support: Process files larger than available RAM
- Unified Architecture: Centralized cache management and key generation
🧠 Intelligent Caching (CacheMixin)¶
Features¶
The new caching system provides:
- Automatic Caching: Values are automatically cached on first access
- TTL Support: Time-to-live for cached entries
- LRU Eviction: Least Recently Used eviction when cache is full
- Memory Management: Size-based eviction to prevent memory issues
- Statistics Tracking: Hit/miss ratios and performance metrics
- Pattern Invalidation: Invalidate cache entries using wildcard patterns
- Thread Safety: Safe for use in multi-threaded environments
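To make the interaction between TTL expiry, LRU eviction, and hit/miss statistics concrete, here is a minimal sketch of such a cache. It is an illustration only, not YAPFM's implementation: the class name `TTLLRUCache` and its methods are hypothetical, and it omits the locking and size-based eviction that the features above mention.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Illustrative cache with TTL expiry and LRU eviction (hypothetical, not YAPFM's code)."""

    def __init__(self, max_size=1000, ttl=3600.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None or entry[0] <= self._clock():
            # Missing or expired: drop any stale entry and count a miss.
            self._data.pop(key, None)
            self.misses += 1
            return default
        self._data.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    def set(self, key, value):
        self._data[key] = (self._clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = TTLLRUCache(max_size=2, ttl=60)
cache.set("database.host", "localhost")
cache.set("database.port", 5432)
cache.get("database.host")          # hit; "database.host" becomes most recent
cache.set("api.key", "secret")      # cache is full, so "database.port" is evicted
print(cache.get("database.port"))   # None
print(f"{cache.hit_rate:.0%}")      # 50%
```

The injectable `clock` argument exists only so that expiry can be exercised in tests without sleeping.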
Basic Usage¶
from yapfm import YAPFileManager
# Enable caching
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=1000,  # Maximum number of cached entries
    cache_ttl=3600,   # TTL in seconds (1 hour)
)
# First access loads from file and caches
host = fm.get_value("database.host")
print(f"Database host: {host}")
# Subsequent accesses return from cache (much faster)
host_cached = fm.get_value("database.host") # Returns from cache
Advanced Caching¶
# Get cache statistics
stats = fm.get_cache_stats()
print(f"Cache hits: {stats['unified_cache']['hits']}")
print(f"Cache misses: {stats['unified_cache']['misses']}")
print(f"Hit rate: {stats['unified_cache']['hit_rate']:.2%}")
# Invalidate specific patterns
fm.invalidate_cache("key:database.*") # Invalidate all database keys
# Clear all cache
fm.clear_cache()
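Wildcard invalidation of this kind can be pictured as shell-style pattern matching over the cache keys. The sketch below uses the standard library's `fnmatch` on a plain dictionary; the helper name `invalidate` and the exact key format are assumptions for illustration, and YAPFM's own matching rules may differ.

```python
from fnmatch import fnmatchcase


def invalidate(cache, pattern):
    """Remove every entry whose key matches a shell-style wildcard pattern."""
    # Collect matches first so the dict is not mutated while iterating.
    doomed = [key for key in cache if fnmatchcase(key, pattern)]
    for key in doomed:
        del cache[key]
    return len(doomed)


cache = {
    "key:database.host": "localhost",
    "key:database.port": 5432,
    "key:api.timeout": 30,
}
removed = invalidate(cache, "key:database.*")
print(removed, sorted(cache))  # 2 ['key:api.timeout']
```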
Performance Benefits¶
- Speed: Cached values are served from memory, so repeated lookups of the same key skip the work done on first access
- Memory Efficient: LRU eviction prevents memory bloat
- Configurable: Adjust cache size and TTL based on your needs
- Statistics: Monitor cache performance and optimize usage
🎯 Lazy Loading (LazySectionsMixin)¶
Features¶
Lazy loading provides:
- Memory Efficiency: Sections are loaded only when accessed
- Performance: Avoid loading large sections unnecessarily
- Cache Integration: Works seamlessly with the unified cache system
- Automatic Invalidation: Cache invalidation when sections are modified
- Statistics: Monitor lazy loading performance
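The core idea, deferring a load until first access and reusing the result afterwards, can be sketched in a few lines. The `LazySection` class and the `load_database_section` loader below are hypothetical names used only for illustration; they are not part of YAPFM's API.

```python
class LazySection:
    """Defers an expensive load until the value is first requested."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def get(self):
        if not self._loaded:
            self._value = self._loader()  # runs at most once until invalidated
            self._loaded = True
        return self._value

    def invalidate(self):
        """Forget the loaded value so the next get() reloads it."""
        self._loaded = False
        self._value = None


load_calls = []


def load_database_section():
    load_calls.append("database")  # record each real load for the demo
    return {"host": "localhost", "port": 5432}


section = LazySection(load_database_section)
print(len(load_calls))        # 0
print(section.get()["host"])  # localhost
section.get()                 # served from the stored value
print(len(load_calls))        # 1
```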
Basic Usage¶
from yapfm import YAPFileManager
# Enable lazy loading
fm = YAPFileManager(
    "large_config.json",
    enable_lazy_loading=True,
)
# Section is not loaded until accessed
db_section = fm.get_section("database") # Loads only when accessed
print(f"Database host: {db_section['host']}")
# Subsequent accesses return from lazy cache
db_section_again = fm.get_section("database") # Returns from cache
Advanced Lazy Loading¶
# Force immediate loading (bypass lazy loading)
db_section = fm.get_section("database", lazy=False)
# Update section with cache invalidation
fm.set_section(
    {"host": "newhost", "port": 3306},
    dot_key="database",
    update_lazy_cache=True,
)
# Get lazy loading statistics
stats = fm.get_lazy_stats()
print(f"Total sections: {stats['total_sections']}")
print(f"Loaded sections: {stats['loaded_sections']}")
# Clear lazy cache
fm.clear_lazy_cache()
Memory Benefits¶
- Reduced Memory Usage: Only load sections when needed
- Faster Startup: Don't load entire file at startup
- Selective Loading: Load only the sections you actually use
- Automatic Management: Cache invalidation when sections change
🌊 Streaming Support (StreamingMixin)¶
Features¶
Streaming provides:
- Large File Support: Process files larger than available RAM
- Chunked Reading: Process files in configurable chunks
- Memory Efficient: Constant memory usage regardless of file size
- Multiple Encodings: Support for different file encodings
- Progress Tracking: Monitor processing progress
- Search Capabilities: Search within large files
- Section Extraction: Extract specific sections from large files
Basic Usage¶
from yapfm import YAPFileManager
# Enable streaming
fm = YAPFileManager(
    "large_file.txt",
    enable_streaming=True,
)
# Stream file in chunks
for chunk in fm.stream_file(chunk_size=1024 * 1024):  # 1MB chunks
    process_chunk(chunk)
# Stream line by line
for line in fm.stream_lines():
    if "error" in line.lower():
        print(f"Error found: {line}")
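For readers who want to see why chunked reading keeps memory usage bounded, the following is a minimal generator that yields a file piece by piece. It is a sketch of the general technique, assuming nothing about how YAPFM implements `stream_file`; the function name `stream_chunks` is hypothetical.

```python
import os
import tempfile


def stream_chunks(path, chunk_size=1024 * 1024, encoding="utf-8"):
    """Yield successive pieces of at most chunk_size characters from a text file."""
    with open(path, "r", encoding=encoding) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # empty string signals end of file
                break
            yield chunk


# Build a small scratch file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("".join(f"line {i}\n" for i in range(10)))
    path = tmp.name

try:
    # Only one 16-character chunk is held in memory at a time.
    total_lines = sum(chunk.count("\n") for chunk in stream_chunks(path, chunk_size=16))
finally:
    os.remove(path)

print(total_lines)  # 10
```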
Advanced Streaming¶
# Stream sections with markers
for section in fm.stream_sections("[", "]"):
    print(f"Section: {section['name']}")
    print(f"Content: {section['content']}")
# Process with custom function
def count_lines(chunk):
    return chunk.count('\n')
def progress_callback(progress):
    print(f"Progress: {progress:.1%}")
results = list(fm.process_large_file(count_lines, progress_callback))
total_lines = sum(results)
# Search in large file
for match in fm.search_in_file("error", case_sensitive=False):
    print(f"Found: {match['match']}")
    print(f"Context: {match['context']}")
# Get file information
size = fm.get_file_size()
progress = fm.get_file_progress()
estimated_time = fm.estimate_processing_time(count_lines)
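A streaming search can be approximated by scanning one line at a time while keeping a small window of preceding lines as context. The sketch below is an assumption-laden illustration: `search_lines` is a hypothetical helper, the result keys mirror the ones used in the examples above, and the real `search_in_file` may define context differently.

```python
import io
from collections import deque


def search_lines(lines, term, case_sensitive=True, context=1):
    """Yield a dict for each line containing term, with up to `context` preceding lines."""
    needle = term if case_sensitive else term.lower()
    before = deque(maxlen=context)  # rolling window of earlier lines
    for line_number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        haystack = line if case_sensitive else line.lower()
        if needle in haystack:
            yield {
                "line_number": line_number,
                "match": line,
                "context": list(before),
            }
        before.append(line)


log = io.StringIO("start\nconnect ok\nERROR: timeout\nretry\nerror: refused\n")
matches = list(search_lines(log, "error", case_sensitive=False))
for match in matches:
    print(match["line_number"], match["match"], match["context"])
# 3 ERROR: timeout ['connect ok']
# 5 error: refused ['retry']
```

Because only the current line and the context window are held in memory, the same loop works on an open file handle of any size.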
Performance Benefits¶
- Memory Efficient: Process files of any size
- Configurable: Adjust chunk size based on available memory
- Progress Tracking: Monitor long-running operations
- Search Capabilities: Find patterns in large files efficiently
🏗️ Unified Architecture¶
Centralized Cache Management¶
The new architecture provides:
- Unified Cache: Single cache instance for all operations
- Key Generation: Centralized key generation with caching
- Statistics: Comprehensive statistics across all caching mechanisms
- Memory Management: Centralized memory management and cleanup
Key Generation Optimization¶
# Key generation is now cached for performance
key1 = fm._generate_cache_key("database.host", None, None, "key")
key2 = fm._generate_cache_key("database.host", None, None, "key") # Returns cached key
# Clear key generation cache
fm.clear_key_cache()
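Caching the result of key generation is a form of memoization. As a rough illustration, the standard library's `functools.lru_cache` provides the same effect for a pure function; the function `make_cache_key` and its `prefix:dot_key` format are assumptions for this sketch, not YAPFM's actual key scheme.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def make_cache_key(dot_key, prefix="key"):
    """Build a cache key string; repeated calls with the same arguments are memoized."""
    return f"{prefix}:{dot_key}"


key1 = make_cache_key("database.host")
key2 = make_cache_key("database.host")  # served from the memoization cache
info = make_cache_key.cache_info()
print(key1, info.hits, info.misses)  # key:database.host 1 1

make_cache_key.cache_clear()  # analogous to clearing the key generation cache
```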
Comprehensive Statistics¶
# Get unified statistics
stats = fm.get_cache_stats()
print("Unified Cache Stats:")
print(f" Hits: {stats['unified_cache']['hits']}")
print(f" Misses: {stats['unified_cache']['misses']}")
print(f" Hit Rate: {stats['unified_cache']['hit_rate']:.2%}")
print("Lazy Sections Stats:")
print(f" Total Sections: {stats['lazy_sections']['total_sections']}")
print(f" Loaded Sections: {stats['lazy_sections']['loaded_sections']}")
print("Key Cache Stats:")
print(f" Size: {stats['key_cache']['size']}")
🔧 Configuration Examples¶
High-Performance Configuration¶
# For high-performance applications
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=10000,  # Large cache
    cache_ttl=7200,    # 2 hours TTL
    enable_lazy_loading=True,
    enable_streaming=True,
)
Memory-Conscious Configuration¶
# For memory-constrained environments
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=100,  # Small cache
    cache_ttl=300,   # 5 minutes TTL
    enable_lazy_loading=True,
    enable_streaming=True,
)
Development Configuration¶
# For development with frequent changes
fm = YAPFileManager(
    "config.json",
    enable_cache=True,
    cache_size=1000,
    cache_ttl=60,  # Short TTL for development
    enable_lazy_loading=False,  # Disable for easier debugging
    enable_streaming=True,
)
📊 Performance Monitoring¶
Cache Performance¶
# Monitor cache performance
stats = fm.get_cache_stats()
hit_rate = stats['unified_cache']['hit_rate']
if hit_rate < 0.8:  # Less than 80% hit rate
    print("Warning: Low cache hit rate, consider increasing cache size")
    fm.clear_cache()  # Clear cache and start fresh
Memory Usage¶
# Monitor memory usage
stats = fm.get_cache_stats()
memory_usage = stats['unified_cache']['memory_usage_mb']
if memory_usage > 50:  # More than 50MB
    print("Warning: High memory usage, consider reducing cache size")
    fm.clear_cache()
Lazy Loading Efficiency¶
# Monitor lazy loading efficiency
stats = fm.get_lazy_stats()
total_sections = stats['total_sections']
# Guard against division by zero when no sections are registered
loaded_ratio = stats['loaded_sections'] / total_sections if total_sections else 0.0
if loaded_ratio > 0.5:  # More than 50% of sections loaded
    print("Warning: High lazy loading ratio, consider disabling lazy loading")
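The three checks in this section can be folded into one helper that takes the statistics dictionaries and returns a list of warnings. This is a sketch under the assumption that the dictionaries have the shape used in the examples above; `check_cache_health` and its threshold parameters are hypothetical names, and the sample values are made up for the demonstration.

```python
def check_cache_health(cache_stats, lazy_stats,
                       min_hit_rate=0.8, max_memory_mb=50, max_loaded_ratio=0.5):
    """Return warning strings for stats shaped like the monitoring examples."""
    warnings = []
    unified = cache_stats["unified_cache"]
    if unified["hit_rate"] < min_hit_rate:
        warnings.append("low cache hit rate")
    if unified["memory_usage_mb"] > max_memory_mb:
        warnings.append("high cache memory usage")
    total = lazy_stats["total_sections"]
    # Guard against division by zero when no sections are registered yet.
    loaded_ratio = lazy_stats["loaded_sections"] / total if total else 0.0
    if loaded_ratio > max_loaded_ratio:
        warnings.append("most sections already loaded")
    return warnings


# Made-up sample statistics for illustration.
sample_cache_stats = {"unified_cache": {"hit_rate": 0.65, "memory_usage_mb": 12.5}}
sample_lazy_stats = {"total_sections": 10, "loaded_sections": 7}
print(check_cache_health(sample_cache_stats, sample_lazy_stats))
# ['low cache hit rate', 'most sections already loaded']
```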
🚨 Best Practices¶
Caching Best Practices¶
- Choose Appropriate Cache Size: Balance memory usage with performance
- Set Reasonable TTL: Use a short TTL for data that changes frequently, or do not cache it at all
- Monitor Hit Rates: Aim for 80%+ hit rate
- Use Pattern Invalidation: Invalidate related cache entries together
- Clear Cache When Needed: Clear cache when data changes significantly
Lazy Loading Best Practices¶
- Use for Large Sections: Only use lazy loading for sections that are large
- Monitor Memory Usage: Keep track of loaded sections
- Invalidate When Needed: Update lazy cache when sections change
- Consider Access Patterns: Disable lazy loading if most sections end up being accessed anyway, since deferring the load then saves little
Streaming Best Practices¶
- Choose Appropriate Chunk Size: Balance memory usage with I/O efficiency
- Use Progress Callbacks: Monitor long-running operations
- Handle Errors Gracefully: Streaming operations can fail partway through a large file (for example on I/O or decoding errors), so wrap them in error handling
- Consider File Size: Use streaming for files larger than available RAM
- Test with Different Chunk Sizes: Find the optimal chunk size for your use case
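One simple way to compare chunk sizes is to time a full read of a representative file at each size. The sketch below does this with plain file I/O rather than through YAPFM; `time_chunk_size` is a hypothetical helper, the 1 MiB scratch file is only a stand-in for real data, and timings will vary by machine and disk cache state.

```python
import os
import tempfile
import time


def time_chunk_size(path, chunk_size):
    """Read the whole file in chunk_size-byte pieces; return (chunks, seconds)."""
    chunks = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while handle.read(chunk_size):
            chunks += 1
    return chunks, time.perf_counter() - start


# Build a 1 MiB scratch file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1024 * 1024))
    path = tmp.name

try:
    results = {
        size: time_chunk_size(path, size)
        for size in (4096, 65536, 1024 * 1024)
    }
finally:
    os.remove(path)

for size, (chunks, seconds) in results.items():
    print(f"{size:>8} bytes/chunk: {chunks:>4} chunks in {seconds:.6f}s")
```

For meaningful numbers, run the comparison on a file of realistic size and repeat each measurement several times.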
🔄 Migration Guide¶
From Basic to Cached¶
# Old way
host = fm.get_key("database.host")
# New way with caching
host = fm.get_value("database.host") # Automatically cached
From Immediate to Lazy Loading¶
# Old way
section = fm.get_section("database")
# New way with lazy loading
section = fm.get_section("database", lazy=True) # Lazy loaded
Adding Streaming Support¶
# For large files
fm = YAPFileManager("large_file.txt", enable_streaming=True)
# Process in chunks
for chunk in fm.stream_file():
    process_chunk(chunk)
🎯 Use Cases¶
Configuration Management¶
# High-performance configuration management
fm = YAPFileManager(
    "app_config.json",
    enable_cache=True,
    cache_size=1000,
    cache_ttl=3600,
    enable_lazy_loading=True,
)
# Fast access to frequently used values
db_host = fm.get_value("database.host")
api_key = fm.get_value("api.key")
Large File Processing¶
# Process large log files
fm = YAPFileManager("access.log", enable_streaming=True)
# Search for errors
for match in fm.search_in_file("ERROR", case_sensitive=False):
    print(f"Error at line {match['line_number']}: {match['match']}")
Memory-Efficient Data Access¶
# Process large configuration files
fm = YAPFileManager(
    "large_config.json",
    enable_lazy_loading=True,
    enable_cache=True,
)
# Only load sections when needed
if user_needs_database_config:
    db_config = fm.get_section("database")
    process_database_config(db_config)
🔮 Future Enhancements¶
Planned enhancements include:
- Distributed Caching: Support for Redis and other distributed caches
- Compression: Automatic compression of cached data
- Encryption: Encrypted caching for sensitive data
- Metrics: More detailed performance metrics
- Profiling: Built-in profiling tools
- Visualization: Cache performance visualization tools
For more information about these features, see the API Reference and Examples.