Restore name of a tar.gz archive
Introduction
This page describes how to restore the original names of most tar.gz archives recovered by photorec. First, create a file with more detailed information about the recovered files; see: post recovery tasks.
tar.gz files recovered by photorec might look like ./recup_dir.996/f864593944_wmmaiload-1.0.5.tar.gz or ./recup_dir.996/f864589184.tar.gz.
A tar.gz archive contains a tar archive. If possible, the name of that tar archive is used to restore the file name of the tar.gz; otherwise the name of the first folder or file inside the archive is used.
The name of a tar.gz archive can be restored in the following ways:
- From the part of the file name after '_', when photorec created the file with a pattern like f864593944_wmmaiload-1.0.5.tar.gz
- From the name of the tar archive inside the tar.gz archive.
- From the name of the folder inside it. In many cases a whole folder was compressed, and its name is likely to be similar to the archive name.
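The three ways listed above can be sketched as small shell functions. This is only an illustration, not the script used on this page: the function names are hypothetical, the second function assumes GNU gzip (whose -l and -N options print the name stored in the gzip header), and the third assumes a tar that can list gzip-compressed archives with -tzf.

```shell
#!/bin/bash
# Hypothetical helpers, one per way of guessing the original name.

# Way 1: drop the "f<digits>_" prefix that photorec puts in front of a name it found.
# Prints nothing when the file name has no such prefix (loose match, by design).
name_from_photorec() {
  base=$(basename "$1")
  case "$base" in
    f[0-9]*_?*) printf '%s\n' "${base#*_}" ;;
  esac
}

# Way 2: read the tar name stored in the gzip header (assumes GNU gzip).
# If the header holds no name, gzip reports the file name minus ".gz" instead.
name_from_gzip_header() {
  stored=$(gzip -lN -- "$1" 2>/dev/null |
    awk 'NR == 2 { sub(/^ *[0-9]+ +[0-9]+ +[-0-9.]+% +/, ""); print }')
  if [ -n "$stored" ]; then basename "$stored"; fi
}

# Way 3: take the top-level folder or file of the first entry in the archive.
name_from_first_entry() {
  first=$(tar -tzf "$1" 2>/dev/null | head -n 1)
  first=${first#./}
  printf '%s\n' "${first%%/*}"
}

# Works on the name alone, so no archive is needed for this call:
name_from_photorec './recup_dir.996/f864593944_wmmaiload-1.0.5.tar.gz'
# prints wmmaiload-1.0.5.tar.gz
```

The second and third functions need a real archive as their argument, for example name_from_first_entry ./recup_dir.996/f864589184.tar.gz.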
Collect information about tar.gz archives
The following script collects the data needed from inside the archives in order to restore their names.
collect-info-about-tar-gz.sh
#!/bin/bash
# Usage: collect-info-about-tar-gz.sh FILE
# Appends one record per archive to /tmp/tmpDB.txt:
#   path|restored name|Errors:|OK or Damaged|
FileName="$1"
# Only handle files whose name ends in .tar.gz (case-insensitive).
if ! echo "$FileName" | grep -qi '\.tar\.gz$'; then exit 0; fi
# 7z prints "Everything is Ok" when the archive passes its integrity test.
if 7z t "$FileName" 2>/dev/null | grep -q '^Everything is Ok$'; then ErrorFlag='OK'; else ErrorFlag='Damaged'; fi
# Name of the first file or folder stored in the archive.
BaseFileName="$(tar -tzf "$FileName" 2>/dev/null | head -n 1)"
# Name kept by photorec, if any: f864593944_wmmaiload-1.0.5.tar.gz -> wmmaiload-1.0.5
Recovered="$(basename "$FileName")"
if [[ "$Recovered" =~ ^f[0-9]+_(.+)\.[Tt][Aa][Rr]\.[Gg][Zz]$ ]]; then BaseFileName="${BASH_REMATCH[1]}"; fi
# Keep only the top-level component, without a leading "./" or a trailing "/".
BF="$(echo "$BaseFileName" | sed -e 's|^\./||' -e 's|/.*$||')"
echo "$FileName|${BF}.tar.gz|Errors:|${ErrorFlag}|" >> /tmp/tmpDB.txt
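Run find . -type f -iname '*.tar.gz' -exec collect-info-about-tar-gz.sh {} \; (the script must be executable and on your PATH). Every run appends to /tmp/tmpDB.txt, so delete that file before starting a new run.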
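Before restoring anything, it can help to look at what was collected. The sketch below assumes the record layout written by the script above (path|name|Errors:|status|, one record per line); the two function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting the collected records.

# Count records per integrity status (the 4th "|"-separated field).
summarize_db() {
  awk -F'|' '{ count[$4]++ } END { for (s in count) print s, count[s] }' \
    "${1:-/tmp/tmpDB.txt}" | sort
}

# List the archives for which no name was found (the name field is just ".tar.gz").
list_unnamed() {
  awk -F'|' '$2 == ".tar.gz" { print $1 }' "${1:-/tmp/tmpDB.txt}"
}
```

Both functions read /tmp/tmpDB.txt unless another file is passed as the first argument, for example summarize_db ./my-copy-of-tmpDB.txt.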
Restore filenames
This script restores the file names based on the collected information about the files.
restore-tar-gz-names.sh
#!/bin/bash
# Copies every archive listed in /tmp/tmpDB.txt to $HOME/tar-gz under its restored name.
Destination="$HOME/tar-gz"
SourceFolder='Sorted'
mkdir -vp "$Destination/$SourceFolder" "$Destination/BadName" "$Destination/Duples"
CountAll=0
# Each record looks like: path|restored name|Errors:|OK or Damaged|
while IFS='|' read -r PathToFile FileName Label ErrorFlag Rest; do
  [ -n "$PathToFile" ] || continue
  CountAll=$((CountAll+1))
  # No name found inside the archive: fall back to the name kept by photorec, if any.
  if [ "$FileName" == '.tar.gz' ]; then
    Recovered="$(basename "$PathToFile")"
    case "$Recovered" in
      f[0-9]*_?*) FileName="${Recovered#*_}" ;;
    esac
  fi
  if [ "$FileName" == '.tar.gz' ] || [ "$ErrorFlag" == 'Damaged' ]; then
    # Damaged archive or no usable name: keep the name given by photorec.
    cp -fv "$PathToFile" "$Destination/BadName/$(basename "$PathToFile")"
    continue
  fi
  # Sub-folder named after the first 4 characters of the file name, in upper case.
  FolderName="$(echo "$FileName" | cut -c1-4 | tr '[:lower:]' '[:upper:]')"
  Target="$Destination/$SourceFolder/$FolderName"
  mkdir -p "$Target"
  if [ -f "$Target/$FileName" ]; then
    # A file with this name was already restored: store this one as a duplicate.
    cp -fv "$PathToFile" "$Destination/Duples/${FileName}_Duplicate${CountAll}"
  else
    cp -fv "$PathToFile" "$Target/$FileName"
  fi
done < /tmp/tmpDB.txt
echo "$CountAll"
Files are restored to paths like $HOME/tar-gz/Sorted/MIXE, where MIXE is the first 4 characters of the file name in upper case. You can adjust this with the cut -c1-4 command inside the script.
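To see which folder a given name would end up in, or to try a different prefix length before editing the script, the same cut and tr pipeline can be wrapped in a function. The function name and its optional length argument are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: folder name for a restored file name.
# $1 = file name, $2 = number of leading characters to use (defaults to 4).
sorted_folder() {
  name=$1
  length=${2:-4}
  printf '%s\n' "$name" | cut -c"1-${length}" | tr '[:lower:]' '[:upper:]'
}

sorted_folder 'wmmaiload-1.0.5.tar.gz'     # prints WMMA
sorted_folder 'wmmaiload-1.0.5.tar.gz' 2   # prints WM
```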
Damaged files and files whose name could not be determined are placed in $HOME/tar-gz/BadName.
Duplicates are placed in $HOME/tar-gz/Duples with names like filename.tar.gz_Duplicate123, where 123 is the number of the processed file.
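Because the _Duplicate suffix comes after .tar.gz, a duplicate has to be renamed before most tools recognise it as an archive. A minimal sketch for recovering the usable name from such a file name, with a hypothetical function name:

```shell
#!/bin/bash
# Hypothetical helper: strip the "_Duplicate<number>" suffix from a file name.
# Names without that suffix are printed unchanged.
original_name_of_duplicate() {
  case "$1" in
    *_Duplicate[0-9]*) printf '%s\n' "${1%_Duplicate*}" ;;
    *) printf '%s\n' "$1" ;;
  esac
}

original_name_of_duplicate 'filename.tar.gz_Duplicate123'   # prints filename.tar.gz
```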