Filesystem Corruption Data Loss - Emergency Troubleshooting Steps
Filesystem corruption detected right now? Quick steps to assess damage, prevent further data loss, and recover data within minutes.
Filesystem corruption has been detected and data may be lost. This guide gives you immediate steps to assess the damage, prevent further data loss, and recover data now. No theory, just action.
For setting up monitoring to prevent this in the future, see Filesystem Health Monitoring Guide after you've resolved the immediate crisis.
60-Second Triage
Run these commands in order:
# Step 1: Confirm filesystem corruption (takes 5 seconds)
dmesg | grep -i "filesystem\|error\|corrupt\|i/o error" | tail -20
# Look for filesystem errors or corruption warnings
# Step 2: Check filesystem status (takes 5 seconds)
df -hT
# Check if filesystem is mounted read-only or shows errors
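# Step 3: Assess damage (about 10 seconds on a small filesystem, longer on large ones)
# Replace /dev/sda1 with the affected device; results are only reliable when it is unmounted
fsck -n /dev/sda1 # ext4 (read-only check, makes no changes)
# OR
xfs_repair -n /dev/sda1 # XFS (no-modify check; unmount the filesystem first)
# Review the output to gauge how far the corruption extends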
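Step 2 asks you to spot read-only mounts by eye in the df output. As a rough sketch, the same check can be scripted against /proc/mounts. The function name list_ro_mounts is hypothetical, and the filesystem types it filters on are an assumption about what you run.

```shell
# List ext2/3/4, XFS, and Btrfs filesystems whose mount options include "ro".
# Reads a file in /proc/mounts format: device mountpoint fstype options dump pass.
# Options are split on commas and compared exactly, so "errors=remount-ro"
# on a healthy read-write mount is not reported by mistake.
list_ro_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 ~ /^(ext[234]|xfs|btrfs)$/ {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro") { print $1, $2, $3; break }
         }' "$mounts_file"
}
```

Calling list_ro_mounts with no argument reads the live mount table. Any line it prints is a filesystem that is currently read-only, either by configuration or because the kernel remounted it after an error.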
Common Symptoms and Quick Fixes
| Symptom | Likely Cause | Quick Fix |
|---|---|---|
| Filesystem mounted read-only | Corruption detected | Run filesystem check, back up data, repair filesystem |
| I/O errors in logs | Disk hardware failure | Check disk health (SMART), back up data, replace disk |
| Missing files | Metadata corruption | Run filesystem check, recover from backup |
| Cannot mount filesystem | Severe corruption | Run filesystem repair, restore from backup if needed |
| Data corruption | Block-level corruption | Run filesystem check, verify data integrity |
How to Detect Filesystem Corruption
Automatic Detection with Zuzia.app
Zuzia.app automatically monitors filesystem health on your server through its agent-based system. The system:
- Checks filesystem status every few minutes automatically
- Stores all filesystem health data historically in the database
- Sends alerts when filesystem errors or corruption are detected
- Tracks filesystem health trends over time
- Uses AI analysis (full package) to detect unusual patterns
You'll receive notifications via email or other configured channels when filesystem corruption is detected, allowing you to respond quickly before data loss occurs.
Manual Detection Methods
You can also check for filesystem corruption manually using commands that Zuzia.app can execute:
# Check for filesystem errors
dmesg | grep -i "filesystem\|error\|corrupt"
# Check filesystem status
df -hT
# Check ext4 filesystem (read-only)
fsck.ext4 -n /dev/sda1
# Check XFS filesystem (read-only)
xfs_repair -n /dev/sda1
# Check filesystem mount status
mount | grep /dev/sda1
Add these commands as scheduled tasks in Zuzia.app to monitor filesystem health continuously and receive alerts when corruption is detected.
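The grep pattern above matches any line containing "error", which can be noisy on a busy server. A narrower filter is sketched below. The function name filter_fs_errors is hypothetical, and the pattern list is an illustrative assumption based on common kernel message prefixes, not an exhaustive catalogue.

```shell
# Keep only kernel log lines that usually point at filesystem or block-layer trouble.
# Reads log text on stdin and writes matching lines to stdout.
# Exit status follows grep: 0 if something matched, 1 if nothing did.
filter_fs_errors() {
    grep -E -i 'EXT4-fs (error|warning)|XFS.*(corrupt|metadata I/O error)|BTRFS (error|critical)|Buffer I/O error|I/O error, dev |Remounting filesystem read-only'
}
```

A typical invocation would be dmesg | filter_fs_errors | tail -20. Extend the pattern if your kernel or filesystem logs use different wording.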
Common Causes of Filesystem Corruption
1. Disk Hardware Failures
Disk hardware failures can cause filesystem corruption:
Signs:
- I/O errors in system logs
- Disk SMART errors
- Intermittent filesystem errors
- System crashes or freezes
Solutions:
- Check disk health with SMART utilities
- Replace failing disks immediately
- Run filesystem repair after disk replacement
- Restore data from backups if needed
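For the SMART check mentioned above, the overall verdict from smartctl -H can be reduced to a single word for scripting. This is a minimal sketch that assumes smartmontools is installed; the function name smart_health_verdict is hypothetical, and it only recognizes the two summary lines smartctl prints for ATA/NVMe and SCSI devices.

```shell
# Classify the output of `smartctl -H <device>` read from stdin.
# Prints "ok", "failing", or "unknown" (no recognizable summary line found).
smart_health_verdict() {
    awk '
        /SMART overall-health self-assessment test result:/ {
            verdict = ($NF == "PASSED") ? "ok" : "failing"
        }
        /SMART Health Status:/ {
            verdict = ($NF == "OK") ? "ok" : "failing"
        }
        END { print (verdict == "" ? "unknown" : verdict) }
    '
}
```

Usage would look like smartctl -H /dev/sda | smart_health_verdict, run as root. An "ok" verdict does not rule out a degrading disk, so it is still worth reviewing the full attribute table from smartctl -a when I/O errors appear in the logs.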
2. Improper System Shutdown
Improper shutdowns can cause filesystem corruption:
Signs:
- Filesystem errors after power loss
- Incomplete writes detected
- Journal corruption (ext4)
- Metadata inconsistencies
Solutions:
- Run filesystem check after improper shutdown
- Repair filesystem if corruption detected
- Implement proper shutdown procedures
- Use UPS to prevent power loss
3. Software Bugs or Kernel Issues
Software bugs can cause filesystem corruption:
Signs:
- Corruption after software updates
- Kernel errors in logs
- Filesystem driver issues
- Application-level corruption
Solutions:
- Review recent software changes
- Check kernel logs for errors
- Update the kernel or filesystem driver packages if a fix is available
- Report bugs to software vendors
Step-by-Step Solutions for Filesystem Corruption
Step 1: Assess Corruption Extent
When filesystem corruption is detected:
Check Current Filesystem Status:
- View Zuzia.app dashboard for current filesystem health
- Check filesystem mount status with df -hT
- Review kernel messages for filesystem errors
- Run read-only filesystem check
Identify Corruption Type:
- Review filesystem check output
- Check for metadata corruption vs data corruption
- Assess corruption extent
- Identify affected files or directories
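The read-only check differs by filesystem type, so a small helper can pick the right command. The sketch below only prints the command and never runs it. The function name readonly_check_cmd is hypothetical; e2fsck is the program behind fsck.ext4.

```shell
# Print the non-modifying check command for a filesystem type and device.
# Nothing is executed; the caller decides whether to run the printed command.
readonly_check_cmd() {
    fstype="$1"
    device="$2"
    case "$fstype" in
        ext2|ext3|ext4) printf 'e2fsck -n %s\n' "$device" ;;
        xfs)            printf 'xfs_repair -n %s\n' "$device" ;;
        btrfs)          printf 'btrfs check --readonly %s\n' "$device" ;;
        *)              printf 'unsupported filesystem type: %s\n' "$fstype" >&2
                        return 1 ;;
    esac
}
```

One way to feed it is readonly_check_cmd "$(lsblk -no FSTYPE /dev/sda1)" /dev/sda1, where /dev/sda1 stands in for the affected device.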
Step 2: Prevent Further Data Loss
Once you identify filesystem corruption:
Back Up Critical Data Immediately:
- Copy critical files to a safe location if the filesystem is still accessible
- Use dd to create a disk image if corruption is severe
- Document corruption extent for recovery planning
Mount Filesystem Read-Only (if possible):
- Remount the filesystem as read-only to prevent further writes
- Use mount -o remount,ro /mount/point if the filesystem is mounted
- Prevent additional corruption from writes
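The dd imaging step can be sketched as follows. The function name image_device, the 64k block size, and the log file location are assumptions chosen for illustration. The destination must sit on a different, healthy disk with enough free space.

```shell
# Copy a device (or file) to an image, continuing past read errors.
# conv=noerror keeps going after a failed read; conv=sync pads a short or
# failed block with zeros so later data stays at the correct offset.
# dd's statistics and any read errors are saved next to the image in <dst>.log.
image_device() {
    src="$1"
    dst="$2"
    dd if="$src" of="$dst" bs=64k conv=noerror,sync 2>"$dst.log"
}
```

For example, image_device /dev/sda1 /mnt/safe/sda1.img, with both paths as placeholders. Because of conv=sync, the image may end with zero padding up to the next 64k boundary. A smaller block size loses less data around each bad sector at the cost of speed. If GNU ddrescue is available, it is generally a better fit for disks with many read errors.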
Step 3: Repair Filesystem
Based on corruption assessment:
Run Filesystem Repair:
- For ext4: fsck.ext4 -p /dev/sda1 (auto-repair; unmount the filesystem first)
- For XFS: xfs_repair /dev/sda1 (requires unmount)
- For Btrfs: btrfs check --repair /dev/sda1 (use with caution, ideally only after imaging the disk)
- Monitor repair progress
Verify Repair Success:
- Run filesystem check again to verify repair
- Check filesystem mount status
- Verify critical files are accessible
- Test filesystem read/write operations
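When verifying a repair, the exit status of fsck and e2fsck tells you more than a pass or fail: it is a sum of flag values documented in the fsck manual page. The decoder below is a small sketch; the function name describe_fsck_exit is hypothetical. It does not apply to xfs_repair, which uses its own exit codes.

```shell
# Decode the exit status of fsck/e2fsck, which is a sum of flag values.
describe_fsck_exit() {
    code="$1"
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msgs=""
    for pair in "1:errors corrected" "2:reboot required" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:canceled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}     # numeric flag value before the colon
        text=${pair#*:}     # description after the colon
        if [ $((code & bit)) -ne 0 ]; then
            msgs="${msgs:+$msgs; }$text"
        fi
    done
    echo "${msgs:-unknown exit status $code}"
}
```

After a repair run, capture the status with something like fsck.ext4 -p /dev/sda1; describe_fsck_exit $?. Any result that includes "errors left uncorrected" means the automatic repair was not enough and a manual run or a restore from backup is needed.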
Step 4: Recover Data
If data loss occurred:
Restore from Backups:
- Identify lost or corrupted files
- Restore from recent backups
- Verify restored data integrity
- Update applications if needed
Recover from Disk Image (if available):
- Use disk image created before repair
- Extract files from disk image
- Verify recovered data integrity
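One way to verify restored or recovered data is to compare it against a checksum manifest taken from a known-good copy, such as the backup itself. The sketch below assumes sha256sum is available; the function names make_manifest and verify_manifest are hypothetical.

```shell
# Write a SHA-256 manifest (checksum plus relative path) for every regular
# file under a directory. Output goes to stdout.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + )
}

# Verify a directory against a manifest file produced by make_manifest.
# Prints one status line per file; exit status is nonzero if any file is
# missing or its checksum differs.
verify_manifest() {
    ( cd "$1" && sha256sum -c - ) < "$2"
}
```

For example, run make_manifest /backup/data > data.sha256 on the known-good copy, then verify_manifest /srv/data data.sha256 on the restored tree. Both paths are placeholders. Filtering the output with grep -v ': OK$' leaves only the files that need attention.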
Monitoring Filesystem Corruption with Zuzia.app
Automatic Filesystem Health Monitoring
Zuzia.app provides comprehensive filesystem health monitoring:
- Automatic checking: Filesystem health is checked automatically every few minutes
- Historical data: All filesystem health data stored for trend analysis
- Alerts: Receive notifications when filesystem corruption is detected
- Multi-server monitoring: Monitor filesystem health across all servers simultaneously
AI-Powered Filesystem Analysis (Full Package)
If you have Zuzia.app's full package:
- Pattern detection: AI identifies unusual filesystem patterns
- Anomaly detection: Detects filesystem corruption early
- Predictive analysis: Predicts potential filesystem problems before they occur
- Recovery suggestions: Recommends recovery procedures based on corruption type
Custom Filesystem Monitoring Commands
Add custom commands for detailed filesystem analysis:
# Check filesystem errors
dmesg | grep -i "filesystem\|error\|corrupt"
# Check filesystem status
df -hT
# Check ext4 filesystem health
tune2fs -l /dev/sda1 | grep "Filesystem state"
# Show XFS geometry (xfs_info does not report corruption; use xfs_repair -n on an unmounted filesystem for that)
xfs_info /dev/sda1
Schedule these commands in Zuzia.app to monitor filesystem health continuously and receive alerts when corruption is detected.
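For a scheduled check, it helps if the command exits nonzero when something is wrong. The sketch below wraps the tune2fs state line in that way. The function name ext4_state is hypothetical, and how Zuzia.app interprets exit codes or output is not covered here, so treat the alerting side as an assumption to verify.

```shell
# Read `tune2fs -l <device>` output on stdin and print the "Filesystem state" value.
# Returns 0 only when the state is exactly "clean"; any other value
# (for example "clean with errors" or "not clean") returns 1.
ext4_state() {
    state=$(sed -n 's/^Filesystem state:[[:space:]]*//p')
    printf '%s\n' "${state:-unknown}"
    [ "$state" = "clean" ]
}
```

A scheduled command could then be tune2fs -l /dev/sda1 | ext4_state, with /dev/sda1 replaced by the device you want to watch.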
Best Practices for Preventing Filesystem Corruption
1. Monitor Filesystem Health Continuously
Don't wait for problems to occur:
- Use Zuzia.app for continuous filesystem health monitoring
- Set up alerts before corruption becomes severe
- Review filesystem health trends regularly
- Plan maintenance based on filesystem health data
2. Maintain Regular Backups
Backups are essential for recovery:
- Backup critical data regularly
- Test backup restoration procedures
- Store backups on separate storage
- Verify backup integrity regularly
3. Use Reliable Storage Hardware
Hardware reliability matters:
- Use quality storage devices
- Monitor disk health with SMART
- Replace aging disks proactively
- Use RAID for redundancy
4. Implement Proper Shutdown Procedures
Proper shutdowns prevent corruption:
- Use proper shutdown commands
- Implement UPS for power protection
- Avoid forced shutdowns
- Monitor system power status
5. Keep System Software Updated
Software updates fix bugs:
- Update filesystem drivers regularly
- Apply kernel updates promptly
- Update filesystem tools
- Review changelogs for fixes
Troubleshooting Filesystem Corruption: Complete Workflow
Immediate Response (When Corruption is Detected)
Assess Corruption:
- Check filesystem status and errors
- Run read-only filesystem check
- Identify corruption extent
- Document findings
Prevent Further Loss:
- Back up critical data if accessible
- Mount filesystem read-only
- Stop writes to affected filesystem
- Create disk image if severe
Plan Recovery:
- Review backup availability
- Plan filesystem repair procedure
- Schedule maintenance window if needed
- Prepare recovery tools
Long-Term Solutions
Repair Filesystem:
- Run filesystem repair during maintenance window
- Verify repair success
- Test filesystem operations
- Restore data from backups if needed
Investigate Root Cause:
- Review disk health and hardware status
- Check for software bugs or kernel issues
- Review system logs for errors
- Identify and fix underlying causes
Prevent Recurrence:
- Implement better filesystem monitoring
- Improve backup procedures
- Replace failing hardware
- Update system software
FAQ: Common Questions About Filesystem Corruption
How do I know if my filesystem is corrupted?
Zuzia.app automatically monitors filesystem health and sends alerts when corruption is detected. You can also check manually with dmesg | grep -i "filesystem\|error\|corrupt" or by running a read-only filesystem check. Symptoms include I/O errors, missing files, or a filesystem that has been remounted read-only.
What should I do immediately when filesystem corruption is detected?
When filesystem corruption is detected, immediately back up critical data if the filesystem is still accessible, remount the filesystem read-only to prevent further writes, assess the extent of the corruption with a read-only filesystem check, and plan the repair for a maintenance window.
Can filesystem corruption cause permanent data loss?
Yes, filesystem corruption can cause permanent data loss if not addressed promptly. However, early detection through monitoring allows you to repair filesystems before corruption spreads. Regular backups are essential for data recovery regardless of filesystem health monitoring.
How can Zuzia.app help prevent filesystem corruption?
Zuzia.app helps prevent filesystem corruption by monitoring filesystem health continuously, alerting you before corruption becomes severe, tracking filesystem health trends over time, and using AI analysis (full package) to detect patterns and predict potential problems. You can also use Zuzia.app to monitor disk health and detect hardware failures early.
Does AI analysis help with filesystem corruption problems?
Yes, if you have Zuzia.app's full package, AI analysis can detect filesystem corruption patterns, identify early warning signs, predict potential filesystem problems before they occur, suggest recovery procedures based on corruption type, and correlate filesystem issues with other metrics to identify root causes.
Can I monitor filesystem health on multiple servers simultaneously?
Yes, Zuzia.app allows you to add multiple servers and monitor filesystem health across all of them simultaneously. Each server has its own filesystem metrics and can be configured independently. This helps you identify which servers need attention and plan maintenance across your infrastructure.
How often should I check filesystem health?
Zuzia.app checks filesystem health automatically every few minutes. For critical production servers, this frequency is usually sufficient. You can also add custom commands to check filesystem health more frequently if needed. The key is continuous monitoring rather than occasional checks, which Zuzia.app provides automatically.
What's the difference between filesystem corruption and disk failure?
Filesystem corruption refers to logical errors in filesystem structure or metadata. Disk failure refers to physical hardware problems with storage devices. Filesystem corruption can often be repaired with filesystem tools, while disk failures require hardware replacement. A failing disk can also be the cause of filesystem corruption, so check disk health before repairing. Both can cause data loss and should be monitored.
Can I set up automatic actions when filesystem corruption is detected?
Yes, Zuzia.app allows you to configure automatic actions when filesystem corruption is detected. You can set up backup scripts, mount filesystem read-only, send team notifications, and other automated responses. This helps you respond to filesystem corruption automatically without manual intervention.
How does historical filesystem data help with prevention?
Historical filesystem data collected by Zuzia.app shows health trends over time, allowing you to identify degradation patterns, predict when filesystem problems might occur, plan maintenance proactively, and make data-driven decisions about storage upgrades. The AI analysis (full package) can automatically detect trends and suggest when filesystem maintenance might be needed.