Read file1 and remove the lines in that file from file2

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Yanta
Posts: 48
Joined: 01 Sep 2019 07:08

#1 Post by Yanta » 06 May 2020 23:10

Hello Again.
I have two files. One is the Windows hosts file. The other contains a list of entries I want to remove from the hosts file.

file1 is the file containing the lines to remove, file2 is %windir%\system32\drivers\etc\hosts.

File1 may contain full or partial domains, for example:

facebook
www.twitter.com
instagram.com

At present I've hard-coded dozens of entries in a script, with tests based on %USERNAME% and a chain of find /i /v ... commands.
I'd like to automate the process with shorter, more efficient code that handles all users by simply reading a file in the %USERNAME% folder as file1.

Here's an example of my current bloated code...

find /i /v "facebook" %windir%\System32\Drivers\Etc\Hosts > %windir%\System32\Drivers\Etc\Hosts1
del %windir%\System32\Drivers\Etc\Hosts
find /i /v "fb.com" %windir%\System32\Drivers\Etc\Hosts1 > %windir%\System32\Drivers\Etc\Hosts2
del %windir%\System32\Drivers\Etc\Hosts1
find /i /v "twitter.com" %windir%\System32\Drivers\Etc\Hosts2 > %windir%\System32\Drivers\Etc\Hosts3
del %windir%\System32\Drivers\Etc\Hosts2
find /i /v "instagram" %windir%\System32\Drivers\Etc\Hosts3 > %windir%\System32\Drivers\Etc\Hosts
del %windir%\System32\Drivers\Etc\Hosts3

How might one achieve this?

thanks

ShadowThief
Expert
Posts: 1160
Joined: 06 Sep 2013 21:28
Location: Virginia, United States

#2 Post by ShadowThief » 07 May 2020 08:09

findstr has a /G flag that takes in a file which contains a list of strings to search for, so you can use

Code:

findstr /V /G:file1 %windir%\system32\drivers\etc\hosts >hosts2
move hosts2 %windir%\system32\drivers\etc\hosts

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

#3 Post by dbenham » 07 May 2020 08:20

The solution is almost trivial with FINDSTR when using the /G option (ShadowThief missed the /L and /I options that are required to make the solution reliable).

Code:

findstr /LIVG:"file1" "%windir%\System32\Drivers\Etc\Hosts" >"%windir%\System32\Drivers\Etc\Hosts.new"
move /y "%windir%\System32\Drivers\Etc\Hosts.new" "%windir%\System32\Drivers\Etc\Hosts" >nul

Note that this would not work reliably without the /I (ignore case) option, due to a nasty FINDSTR bug when using multiple literal search strings. Thankfully, domain names are not case sensitive. It also helps that domain names cannot contain symbols like \ (backslash), because those also behave oddly in FINDSTR literal searches.

The /L literal option is required due to the possibility of dots in your search. If you can guarantee that the first search string in your file1 does not contain a dot, then I suppose you could drop the /L option because FINDSTR only looks at the first search string to determine if it should be a literal or regular expression search.


Dave Benham

Yanta
Posts: 48
Joined: 01 Sep 2019 07:08

#4 Post by Yanta » 07 May 2020 09:59

I spent hours messing with for loops... Didn't even think of using findstr.
I can't guarantee the first or any line will or won't have a dot. It's highly likely they will.
Thanks, I'll give it a go and let you know how it works out.

Yanta
Posts: 48
Joined: 01 Sep 2019 07:08

#5 Post by Yanta » 07 May 2020 21:31

It works. A little too well.

There are some domains that rely on users making typos - sites like ttwitter.com

So when I have twitter.com in the source file, both twitter.com and ttwitter.com are removed from the target file. I get why this happens, but is there a way to prevent this?

Perhaps I need to do it on an exact match only? I can live with that.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

#6 Post by dbenham » 08 May 2020 02:14

Sure, just add the /X option. It is the same as using "^search$" in a regular expression, except it works with literal searches.

The only potential problem is that it only recognizes end of line before a carriage return. So each line must be terminated by \r\n (carriage return / linefeed). It won't recognize end of line in Unix-style text files that terminate lines with \n.


Dave Benham

Yanta
Posts: 48
Joined: 01 Sep 2019 07:08

#7 Post by Yanta » 08 May 2020 06:08

Hmmm. I must have missed something.

I tried /lixvg:<file>.... Don't /x and /v conflict? Won't /x write only the exact matches to the output file?

I actually do want my cake and to eat it too. There are times when partial matches are desirable, but with the possibility of bad domains being mixed with good, I think I might have to use exact matches in the source file. It's going to be more work, because some domains like graph.facebook.com are to be removed, but pixel.facebook.com should not be.

It's easy enough to specify 127.0.0.1 <domain> in the source file. Just have to watch out for lines with tabs and those with spaces.

I guess I could always replace tabs with spaces in Notepad++.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

#8 Post by dbenham » 08 May 2020 06:49

No, /X and /V work fine together - the command lists lines that do not exactly match any search string. It is easier to think in the negative sense: if a line exactly matches any of the search strings, then it is rejected.
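
To make the /X plus /V combination concrete, here is a small throwaway demo. It is only a sketch: the two scratch files in %TEMP% and their names are made up for illustration, and nothing here touches the real hosts file. The list holds a complete hosts line, because /X compares whole lines.

```batch
@echo off
setlocal
rem Hypothetical scratch files; the real hosts file is not touched.
set "list=%TEMP%\demo_list.txt"
set "hosts=%TEMP%\demo_hosts.txt"

rem One search string: a complete line, since /X matches whole lines only.
>"%list%" echo 127.0.0.1 twitter.com

rem Three sample lines; echo terminates each with CR/LF, which /X needs.
>"%hosts%" (
  echo 127.0.0.1 twitter.com
  echo 127.0.0.1 ttwitter.com
  echo 127.0.0.1 api.twitter.com
)

rem /L literal, /I ignore case, /X whole line, /V print non-matching lines.
findstr /L /I /X /V /G:"%list%" "%hosts%"

del "%list%" "%hosts%"
```

Going by the description above, only the first sample line (the exact match) should be rejected, while the ttwitter.com and api.twitter.com lines should be printed.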

Yanta
Posts: 48
Joined: 01 Sep 2019 07:08

#9 Post by Yanta » 12 May 2020 06:54

OMG!!! I'm a dummy... Flawed logic

The master Hosts file is on my PC. As various lists are updated on GitHub, they are applied to the file I have, so that file has everything. The process I've been working on REMOVES lines from Hosts based on lines in another text file.

I have another script which does a bunch of backups (I think there is another post here somewhere on that one). It copies the Hosts file from my C: drive to the backup drive. It's fine until the first time I remove lines; then they are gone forever. The next time the backup runs, the hosts file without those lines will be copied and eventually applied to various PCs.

Dumb!!! What I should be doing is commenting out the lines, not removing them.

So msftncsi.com should become #msftncsi.com, facebook.com should become #facebook.com, and so forth.
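
One possible way to comment lines out with the same FINDSTR filters is a two-pass sketch like the one below. The assumptions are mine, not something confirmed in this thread: the list is still called file1, partial matching is used (no /X), and the script runs from an elevated prompt so it can write to the hosts folder. It puts the # in front of the whole hosts line (so 127.0.0.1 facebook.com becomes #127.0.0.1 facebook.com), and it does not preserve line order, because the commented lines are appended at the end.

```batch
@echo off
setlocal
rem Hypothetical names: "file1" is the list of strings to comment out.
set "list=file1"
set "hosts=%windir%\System32\Drivers\Etc\hosts"

rem Without the list, pass 1 could produce a wrong file, so stop early.
if not exist "%list%" (echo List not found: "%list%" & exit /b 1)

rem Pass 1: copy every line that contains none of the listed strings.
findstr /L /I /V /G:"%list%" "%hosts%" >"%hosts%.new"

rem Pass 2: append every line that does contain one, prefixed with #.
rem Lines that are already comments and match the list gain a second #.
for /f "delims=" %%A in ('findstr /L /I /G:"%list%" "%hosts%"') do (
  >>"%hosts%.new" echo #%%A
)

rem Swap the rebuilt file in for the original.
move /y "%hosts%.new" "%hosts%" >nul
```

Running it repeatedly keeps adding # characters to lines that still match, which is harmless for the hosts file but untidy; a master copy that is never edited in place avoids that.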

I think I have to change the backup so it does not copy the hosts file. I need to maintain a master copy which is separate from the live file.
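
A minimal sketch of that master/live split, reusing the FINDSTR command from earlier in the thread. The master location and the list file name are hypothetical placeholders; only the live hosts path comes from the thread. The master is only ever read, so removed lines can always be restored by editing the list and running the script again.

```batch
@echo off
setlocal
rem Hypothetical locations for the master copy and the per-user list.
set "master=D:\HostsMaster\hosts"
set "list=%USERPROFILE%\hosts-remove.txt"
set "live=%windir%\System32\Drivers\Etc\hosts"

rem Stop early if an input is missing, so the live file is never replaced
rem by an empty or unfiltered result.
if not exist "%master%" (echo Master not found: "%master%" & exit /b 1)
if not exist "%list%" (echo List not found: "%list%" & exit /b 1)

rem Read the master, drop lines containing any listed string, write a temp file.
findstr /L /I /V /G:"%list%" "%master%" >"%live%.new"

rem Replace the live hosts file; the master itself is left unchanged.
move /y "%live%.new" "%live%" >nul
```

Add /X to the findstr options if the list holds complete hosts lines and only exact matches should be dropped.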
