Hi everyone, I have a bunch of PDF files in a folder, with more coming in all the time, and I need them converted, preferably to Excel format, though text works too. I've heard that AHK can run in a loop and open a .bat file, but that's as far as I got. I want to know if it is possible to convert a PDF using a batch file. Whenever I try googling it, the search just returns the Adobe Acrobat Pro batch system. I need the files in a readable format so MS Access can analyze the data inside. Would it be as simple as saving with a different extension? Any thoughts help!
Thanks,
Eduard
Converting PDF to Text or Excel
Re: Converting PDF to Text or Excel
PDF files can hold their data internally as images, plain text, etc.,
and the actual layout of the data is part of the scenario here.
Plain batch code can't extract data from PDF files, but there are utilities designed to convert PDF to other file types, and these can be scripted in a batch file.
Without samples of the files, and matching samples of the output format you need,
it's not really possible to understand the exact requirements of the task.
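As a rough sketch of what scripting such a utility could look like: this assumes the PDFs contain real text (not scanned images) and that the pdftotext command-line tool from Xpdf or Poppler is installed and on the PATH. Neither the tool nor the folder names come from this thread, so treat them as placeholders.

```batch
@echo off
setlocal
rem Hypothetical folders: adjust both to your own layout.
set "src=C:\pdf\incoming"
set "dst=C:\pdf\text"
if not exist "%dst%" md "%dst%"

rem Convert every PDF that does not have a matching text file yet.
for %%F in ("%src%\*.pdf") do (
    if not exist "%dst%\%%~nF.txt" (
        rem -layout tries to keep columns aligned, which helps with tabular data.
        pdftotext -layout "%%~fF" "%dst%\%%~nF.txt"
        if errorlevel 1 echo Failed: %%~nxF
    )
)
endlocal
```

Because files that were already converted are skipped, the script can be rerun whenever new PDFs arrive. Whether the resulting text is clean enough to import into Access depends entirely on how the PDFs are laid out.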
Re: Converting PDF to Text or Excel
I don't have Word installed at the moment, but I think the last two versions of Word are able to open a PDF file.
Here's one of my (pretty) old scripts that converts Word documents to text, and I think it can be used in this case - viewtopic.php?f=3&t=4755 .
Here's an updated version that should allow saving a document as docx:
'>nul 2>&1|| @copy /Y %windir%\System32\doskey.exe '.exe >nul
'&&@echo off && cls &&goto :end_vbs
Set WordApp = CreateObject("Word.Application")
WordApp.Visible = FALSE
'Open doc read-only, without the conversion confirmation dialog
Set WordDoc = WordApp.Documents.Open(WScript.Arguments.Item(0), False, True)
'wdFormatText 2
'wdFormatUnicodeText 7
format = CInt(WScript.Arguments.Item(2) )
WordDoc.SaveAs WScript.Arguments.Item(1) ,format
WordDoc.Close()
'Quit Word so no hidden WINWORD.EXE process is left behind
WordApp.Quit
WScript.Quit
:end_vbs
'& if "%~1" equ "-help" echo %~n0 word_document [ destination [-unicode]^|[-docx] ] & exit /b 0
'& if "%~1" equ "" echo word document not given & exit /b 1
'& if not exist "%~f1" echo word document does not exist & exit /b 2
'& if "%~2" equ "" ( set "save_as=%~dpn1.txt") else ( set "save_as=%~f2")
'& if exist "%save_as%" del /q "%save_as%"
'& if /i "%~3" equ "-unicode" ( set "format=7") else ( set "format=2")
'& if /i "%~3" equ "-docx" ( set "format=16")
'& taskkill /im winword* /f >nul 2>&1
'& cscript /nologo /E:vbscript "%~f0" "%~f1" "%save_as%" %format%
'& pause
'& rem del /q %windir%\System32\'.exe
and it should be called like:
doctool.bat "some.pdf" "savedAs.docx" -docx
though I can't test it....
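Since the original question was about a whole folder of PDFs, here is an equally untested sketch of a wrapper that calls the script once per file. It assumes the script above is saved as doctool.bat next to the wrapper, that the installed Word version can actually open PDFs, and that the folder name (hypothetical) is adjusted to the real one.

```batch
@echo off
setlocal
rem Hypothetical folder; doctool.bat is assumed to sit next to this script.
set "src=C:\pdf\incoming"

for %%F in ("%src%\*.pdf") do (
    if not exist "%src%\%%~nF.docx" (
        rem CALL returns control to this loop once doctool.bat is done.
        rem The NUL redirect answers the PAUSE at the end of doctool.bat.
        call "%~dp0doctool.bat" "%%~fF" "%src%\%%~nF.docx" -docx <nul
    )
)
endlocal
```

Full paths are passed for both the input and the output file so that Word does not save into its own default folder. Note that doctool.bat force-closes any running Word instance on every call (the taskkill line), so save any open documents first.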
Re: Converting PDF to Text or Excel
foxidrive wrote: PDF files can hold their data internally as images, plain text, etc.,
and the actual layout of the data is part of the scenario here.
Plain batch code can't extract data from PDF files, but there are utilities designed to convert PDF to other file types, and these can be scripted in a batch file.
Without samples of the files, and matching samples of the output format you need,
it's not really possible to understand the exact requirements of the task.
So if I have Adobe Acrobat Pro, can a bat file then convert all the PDFs automatically using Adobe?
Re: Converting PDF to Text or Excel
Eduard14 wrote: So if I have Adobe Acrobat Pro, can a bat file then convert all the PDFs automatically using Adobe?
If it has a command line feature then it can be scripted, but you will first have to investigate to see if it will convert your PDF files in the way you want.
The script that npocmaka_ posted uses Microsoft Word. Perhaps you can test that script.