The ‘find’ command in Linux systems searches through a directory and return files that satisfy certain criteria. For instance, to find a file that contains the string ‘needle text’ in the ‘mydocs’ directory:
find mydocs -type f -exec grep -l "needle text" {} \;
The problem of this approach is that it would search through ALL files in this directory including the binary ones such as images, executables and zip packages. Sensibly, we would only want to search through text files for a specific string. If there are far too many of binary files in present, it’d be a significant waste of CPU usage and time to get what you want, because it’s totally unnecessary to go through the binary files.
To achieve this, use this version of the above command:
find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1
I asked the question at stackoverflow.com and peoro came up with this solution. It works great.
Basically, the bold part checks each file’s mime type information and only searches the files that have ‘text’ in its mime type description. According to the Linux ‘file’ command manual, we can be fairly sure that files with ‘text’ in its mime type string are text files AND all text files have ‘text’ in its mime type description string.
find mydocs -type f -exec grep -l "needle text" {} \;
The problem of this approach is that it would search through ALL files in this directory including the binary ones such as images, executables and zip packages. Sensibly, we would only want to search through text files for a specific string. If there are far too many of binary files in present, it’d be a significant waste of CPU usage and time to get what you want, because it’s totally unnecessary to go through the binary files.
To achieve this, use this version of the above command:
find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1
I asked the question at stackoverflow.com and peoro came up with this solution. It works great.
Basically, the bold part checks each file’s mime type information and only searches the files that have ‘text’ in its mime type description. According to the Linux ‘file’ command manual, we can be fairly sure that files with ‘text’ in its mime type string are text files AND all text files have ‘text’ in its mime type description string.