Compare Files with an additional twist

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Post Reply
Message
Author
SIMMS7400
Posts: 539
Joined: 07 Jan 2016 07:47

Compare Files with an additional twist

#1 Post by SIMMS7400 » 29 Sep 2016 14:43

Hi Folks -

There is a division at one my clients that is requiring an export from a source system which looks like this the following:

Code: Select all

"Name","Project_Activity_Alias"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"


After we give them the FIRST export, going forward, they ONLY want a file that consists of any new lines, as well as any changes made to the Project_Activity_Alias.

So what I am thinking is the following:

I have ability to add an additional column to the above export called "Last Changed On". Then, each month when they request a new file, Id export from my source system. It would look like such:

Code: Select all

"Name","Project_Activity_Alias","Last Changed On"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM","5/5/2016 3:04:50 PM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM","10/21/2015 5:08:27 PM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM","10/21/2015 5:08:27 PM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D","10/21/2015 5:08:27 PM"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D","10/21/2015 5:08:27 PM"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D","10/21/2015 5:08:28 PM"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D","10/21/2015 5:08:28 PM"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D","10/21/2015 5:08:28 PM"


What needs to happen first is cycle through the original export and compare to the most recent export, writing net new lines to a file. Then, second pass, it needs to compare the lines that exist in BOTH files and then check the Last Changed On date. If the Last Changed On Date in the most recent export is newer than the Last Changed On date in the original export , spool that entire line to the new file.

So, let's assume this is the new export file:

Code: Select all

"Name","Project_Activity_Alias","Last Changed On"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM","9/29/2016 3:04:50 PM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM","10/21/2015 5:08:27 PM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM","10/21/2015 5:08:27 PM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D","10/21/2015 5:08:27 PM"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D","10/21/2015 5:08:27 PM"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D","10/21/2015 5:08:28 PM"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D","10/21/2015 5:08:28 PM"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D","9/29/2016 5:08:28 PM"
"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D","10/21/2015 5:08:28 PM"
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)","4/19/2016 2:13:55 PM"
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","9/19/2016 7:47:05 PM"
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","7/20/2016 5:00:07 PM"
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)","4/19/2016 2:13:49 PM"
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","4/20/2016 1:12:16 PM"
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","4/20/2016 1:12:17 PM"
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","4/20/2016 1:12:17 PM"
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)","4/20/2016 1:12:17 PM"
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer","5/23/2016 1:57:38 PM"
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer","5/26/2016 7:56:16 PM"
"PFP-A1043900","CXXXXX Human ADME (TAK-164) - TAK-164 Multiple Cancer","5/26/2016 7:56:16 PM"


As you can see, there are news lines as well as newer Last Changed On dates from the original export.

Therefore, the file I'd essentially send would look like the following:

Code: Select all

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"
"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D"
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer"
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"


I'm hoping this is easily attainable.
Thanks!

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Compare Files with an additional twist

#2 Post by aGerman » 29 Sep 2016 15:33

Is there any chance to change the format of "Last Changed On"?

Batch doesn't know any date or time value type. It's only a "string of characters". If you want to compare it effectively you would need to have a nice time stamp like "20160929170828" instead of "9/29/2016 5:08:28 PM". Also it would have been helpful if it was in the first column.

Steffen

SIMMS7400
Posts: 539
Joined: 07 Jan 2016 07:47

Re: Compare Files with an additional twist

#3 Post by SIMMS7400 » 29 Sep 2016 15:40

Hi aGerman-

I can position Last Changed On column as the first one, however I need to see if I can modify the format.

Thanks!

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Compare Files with an additional twist

#4 Post by aGerman » 29 Sep 2016 15:48

Basically there are 3 requirements
1) Order yyyy mm dd HH MM SS
2) preceding zeros
3) 24 hours clock

It doesn't matter if there are any separators between. E.g. "2016-09-29 17:08:28" would be okay as well.

Steffen

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: Compare Files with an additional twist

#5 Post by Aacini » 29 Sep 2016 19:11

I am confused by this problem. If I correctly understood your description, the original export files does NOT include any "Last Changed On" timestamp, so I don't understand how the inclusion of such data can help to identify new lines or lines that changed in the Project_Activity_Alias field.

An important point that you did not mentioned is the possibility that the values in Name field be unique keys; if this is true, then the code below solve your problem.

Code: Select all

@echo off
setlocal EnableDelayedExpansion

rem Load the lines of the last export file
for /F "tokens=1* delims=," %%a in (lastExport.txt) do set "line[%%~a]=%%~b"

rem Process the lines of the new export file
for /F "tokens=1* delims=," %%a in (newExport.txt) do (
   if not defined line[%%~a] (
      rem Is a new line
      echo %%a,%%b
   ) else if "!line[%%~a]!" neq "%%~b" (
      rem A change was made to the Project_Activity_Alias
      echo %%a,%%b
   )
)

rem Prepare the files for the next cycle
REM del lastExport.txt
REM ren newExport.txt lastExport.txt

I used this file as lastExport.txt:

Code: Select all

"Name","Project_Activity_Alias"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"

... and this one as newExport.txt

Code: Select all

"Name","Project_Activity_Alias"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"
"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D"
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer"
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer"
"PFP-A1043900","CXXXXX Human ADME (TAK-164) - TAK-164 Multiple Cancer"

... and this is the output:

Code: Select all

"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D"
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer"
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer"
"PFP-A1043900","CXXXXX Human ADME (TAK-164) - TAK-164 Multiple Cancer"

Antonio

SIMMS7400
Posts: 539
Joined: 07 Jan 2016 07:47

Re: Compare Files with an additional twist

#6 Post by SIMMS7400 » 30 Sep 2016 02:00

Aacini -

Wow - this works absolutely perfect! Thank you!

I had another meeting with that team last night and they want to add additional columns to the export. If not net new Key, they want to know which column value was changed between the Old export to the New export. This may be too difficult, but is there a way to capture that as just an additional column. Something like "Column Heading was changed from old value is now new value" ?

So if we have OriginalExport:

Code: Select all

"Name","Project_Activity_Alias"

"PFP-C0000200","Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"


And then NewExport:

Code: Select all

"Name","Project_Activity_Alias"

"PFP-C0000200","UPDATE TO THIS ALIAS -Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-C0078100","AD-4833MET Analytical Method Dev. & Valid. - AD-4833 MET Support - CVM"
"PFP-C0078200","Compound Support 00015 AD-4833 SU - AD-4833 SU Support - CVM"
"PFP-A0002100","Compound Support 00012 AD-4833 - AD-4833 T2D"
"PFP-A0022800","Reg Filing 00012 AD-4833 00050 Type 2 diabetes - AD-4833 T2D"
"PFP-A0043500","Epidemiology Study (AD-4833) - AD-4833 T2D"
"PFP-A0043900","Observational (EU) (AD-4833) - AD-4833 T2D"
"PFP-A0146300","Kaiser EMEA (AD-4833) T2D - AD-4833 T2D"
"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D"
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)"
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer"
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer"
"PFP-A1043900","CXXXXX Human ADME (TAK-164) - TAK-164 Multiple Cancer"



The ResultsFile should be:

Code: Select all

"PFP-C0000200","UPDATE TO THIS ALIAS -Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"," Project_Activity Alias was changed from Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM to UPDATE TO THIS ALIAS -Compound Support 00013 AD-4833 MET - AD-4833 MET Support - CVM"
"PFP-A0189400","Project Management Activity (AD-4833) T2D - AD-4833 T2D",""
"PFP-A1040400","Project Management Activity (TAK-331)  IBD (Inflammatory Bowel Disease) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-A1043400","P1 SRD Study (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1019800","CTM Mfg. (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1023400","Compound Support TAK-331 - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1023500","Analytical Method Dev. & Valid (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1023700","Process/Formulation Dev (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1023800","Stability Expense (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-C1023900","Tech Transfer (TAK-331) - TAK-331 - IBD (Inflammatory Bowel Disease)",""
"PFP-A1043700","Project Management Activity (Curosurf) Multiple cancer - TAK-164 Multiple Cancer",""
"PFP-A1043800","CXXXXX Ph1 Dose Esc/Exp (TAK-164) - TAK-164 Multiple Cancer",""
"PFP-A1043900","CXXXXX Human ADME (TAK-164) - TAK-164 Multiple Cancer",""


If the latest requirements cant be captured I understand.

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: Compare Files with an additional twist

#7 Post by Aacini » 30 Sep 2016 14:01

SIMMS7400 wrote:I had another meeting with that team last night and they want to add additional columns to the export. If not net new Key, they want to know which column value was changed between the Old export to the New export. This may be too difficult, but is there a way to capture that as just an additional column. Something like "Column Heading was changed from old value is now new value" ?

Just add the additional column with desired text and values in the ECHO command that show a line when a change was made to the Project_Activity_Alias (hint: it is below the line "rem A change was made to the Project_Activity_Alias" :shock: ), and just a couple quotes when it is a new Key (below "rem Is a new line")... :?

Antonio

SIMMS7400
Posts: 539
Joined: 07 Jan 2016 07:47

Re: Compare Files with an additional twist

#8 Post by SIMMS7400 » 01 Oct 2016 17:53

Will do, thank you! Its working great!

Post Reply