I apologize in advance if this has been discussed in another post.
My broad problem: I have a directory of audio files. When a file gets updated by an XML software program, the old audio file does not get automatically deleted, which leads to a ton of files sitting around. We need a script that can go in and move these old files out into another folder since we can't alter the software code.
My approach: After much struggle, I managed to get a list of the files in the directory (we'll call it filelist.txt). I was able to use findstr to search the .xml files for any line that references one of these audio files and spit that into a list as well (we'll call it xmllist.txt). My goal is to compare these two lists, find the items that appear only in filelist.txt, and use that to create a script that will move those files to another directory.
Why I can't get it to work: The xmllist.txt file is spitting out the entire line when it finds a result. I just want it to spit out the file name+extension, but it's got all the other garbage that is on that particular line. This makes it impossible (for me) to compare the two files and get a usable list of unique audio files.
As a workaround, I've just been copying the xmllist.txt results into Excel, separating the data, using conditional formatting to find the unique entries, and then making a batch file to move the files using that data. However, this process needs to be as automated as possible so that other employees can run it if I get flattened by a bus tonight.
Any help would be appreciated, and remember...I'm a beginner as of 2 days ago. I basically took on this project because I knew more about this than anyone else here.
Thanks a great deal!
Beginner problem w/ findstr
Moderator: DosItHelp
Re: Beginner problem w/ findstr
Can we take a step backward and examine the files?
It may be easier to process the list of files and weed out the duplicates that way.
Are the audio files in a subdirectory tree?
When a file gets processed by your software, does it keep the same name and change the extension?
What are the extensions of filetypes in the tree?
Can you give us a small example of the directory list with the files concerned?
Re: Beginner problem w/ findstr
Are the audio files in a subdirectory tree?
They're all in one directory folder
When a file gets processed by your software, does it keep the same name and change the extension?
The file has a date in it, so basically the file would change from filename_6_22_2012 to filename_6_25_2012. The extensions always stay the same. However, there are circumstances where we just stop using a file altogether. It doesn't get replaced. It's just not necessary anymore.
What are the extensions of filetypes in the tree?
.swf
Can you give us a small example of the directory list with the files concerned?
I don't think I can do that (security reasons here at work). However, it's nothing really special. In a given directory there are just a bunch of .swf files with a variety of file names. Some have the date as mentioned above, some have no logical naming convention as they were named manually.
Thanks again for your help.
Re: Beginner problem w/ findstr
Speerdo wrote:I was able to use findstr to search the .xml files for any line that looks for one of these audio files and spit that into a list as well (we'll call it xmllist.txt).
[...]
Why I can't get it to work: The xmllist.txt file is spitting out the entire line when it finds a result.
Rather than findstr, the better way would be to write a simple .xslt to extract whatever info you need from the .xml file into the .txt file. You can run .xsl transformations at the command line using for example msxsl.exe (http://www.microsoft.com/en-us/download/details.aspx?id=21714).
I could provide some hints if you posted a sample .xml file.
Liviu
Re: Beginner problem w/ findstr
Liviu,
thanks for the help! I will read up on the info you posted and do my best! If I need further assistance I will post again.
Thanks again for the help. I'm always blown away by the willingness of complete strangers on forums to help out those in need.
Re: Beginner problem w/ findstr
Speerdo wrote:Why I can't get it to work: The xmllist.txt file is spitting out the entire line when it finds a result. I just want it to spit out the file name+extension, but it's got all the other garbage that is on that particular line. This makes it impossible (for me) to compare the two files and get a usable list of unique audio files.
.swf are video files, not audio - yeah??
Anyway, say you have an input line that reads something like this (it's mangled, but the important part is getting the path\filename):
<head><body>d:\folder\filename_6_2_2011.swf</body></head>
then you can use < and > as delimiters and count the tokens to retrieve your info, as < and > do not appear in path\filenames.
Try this:
Code:
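@echo off
rem Sample input line; in practice each line would come from xmllist.txt.
set "var=<head><body>d:\folder\filename_6_2_2011.swf</body></head>"
rem Split on < and >; the third token is the path\filename.
for /f "tokens=3 delims=<>" %%a in ("%var%") do set "filename=%%a"
echo %filename%
pause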
I hope that helps.
Liviu's idea might be better, but you might gain some insight into this anyway.
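To tie this back to the original goal, here is a rough sketch of how the same token trick might be applied to every line of xmllist.txt and then compared against filelist.txt. It rests on several assumptions that are not confirmed anywhere in this thread: filelist.txt holds bare file names, one per line (as `dir /b *.swf` would produce); the script is run from the directory that holds the .swf files; the wanted path is the third `<>`-delimited token on every line (if findstr prefixed each line with the .xml file name, the token number will be different); and names.txt, unused.txt and old_files are made-up names. Treat it as a starting point rather than a finished tool.

```batch
@echo off
setlocal
rem Hypothetical names; adjust them for your setup.
set "xmllist=xmllist.txt"
set "filelist=filelist.txt"
set "olddir=old_files"

rem 1. Reduce each matched XML line to name+extension only.
rem    tokens=3 is an assumption based on the sample line above.
if exist names.txt del names.txt
for /f "usebackq tokens=3 delims=<>" %%a in ("%xmllist%") do >>names.txt echo %%~nxa

rem 2. Keep the lines of filelist.txt that match no line of names.txt.
rem    /v = non-matching lines, /x = whole-line match, /l = literal,
rem    /i = ignore case, /g: = read the search strings from a file.
findstr /v /i /l /x /g:names.txt "%filelist%" >unused.txt

rem 3. Dry run: only print the move commands.
rem    Remove "echo" once the output looks right.
if not exist "%olddir%" md "%olddir%"
for /f "usebackq delims=" %%f in ("unused.txt") do echo move "%%f" "%olddir%\"
endlocal
```

Leaving the `echo` in front of `move` means the first run only lists what would be moved, which is worth checking by eye (and checking that names.txt is not empty) before letting it touch the real files.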