Word to pdf python linux

Converting docx to pdf with pure python (on linux, without libreoffice)

I’m dealing with a problem trying to develop a web-app, part of which converts uploaded docx files to pdf files (after some processing). With python-docx and other methods, I do not require a windows machine with word installed, or even libreoffice on linux, for most of the processing (my web server is pythonanywhere — linux but without libreoffice and without sudo or apt install permissions). But converting to pdf seems to require one of those. From exploring questions here and elsewhere, this is what I have so far:

import subprocess try: from comtypes import client except ImportError: client = None def doc2pdf(doc): """ convert a doc/docx document to pdf format :param doc: path to document """ doc = os.path.abspath(doc) # bugfix - searching files in windows/system32 if client is None: return doc2pdf_linux(doc) name, ext = os.path.splitext(doc) try: word = client.CreateObject('Word.Application') worddoc = word.Documents.Open(doc) worddoc.SaveAs(name + '.pdf', FileFormat=17) except Exception: raise finally: worddoc.Close() word.Quit() def doc2pdf_linux(doc): """ convert a doc/docx document to pdf format (linux only, requires libreoffice) :param doc: path to document """ cmd = 'libreoffice --convert-to pdf'.split() + [doc] p = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE) p.wait(timeout=10) stdout, stderr = p.communicate() if stderr: raise subprocess.SubprocessError(stderr) 

As you can see, one method requires comtypes , another requires libreoffice as a subprocess. Other than switching to a more sophisticated hosting server, is there any solution?

Источник

docx2pdf 0.1.8

Convert docx to pdf on Windows or macOS directly using Microsoft Word (must be installed).

Ссылки проекта

Статистика

Метаданные

Лицензия: MIT License (MIT)

Требует: Python >=3.5

Сопровождающие

Классификаторы

Описание проекта

docx2pdf

Convert docx to pdf on Windows or macOS directly using Microsoft Word (must be installed).

On Windows, this is implemented via win32com while on macOS this is implemented via JXA (Javascript for Automation, aka AppleScript in JS).

Install

brew install aljohri/-/docx2pdf 

CLI

usage: docx2pdf [-h] [--keep-active] [--version] input [output] Example Usage: Convert single docx file in-place from myfile.docx to myfile.pdf: docx2pdf myfile.docx Batch convert docx folder in-place. Output PDFs will go in the same folder: docx2pdf myfolder/ Convert single docx file with explicit output filepath: docx2pdf input.docx output.docx Convert single docx file and output to a different explicit folder: docx2pdf input.docx output_dir/ Batch convert docx folder. Output PDFs will go to a different explicit folder: docx2pdf input_dir/ output_dir/ positional arguments: input input file or folder. batch converts entire folder or convert single file output output file or folder optional arguments: -h, --help show this help message and exit --keep-active prevent closing word after conversion --version display version and exit 

Library

See CLI docs above (or in docx2pdf --help ) for all the different invocations. It is the same for the CLI and python library.

Jupyter Notebook

If you are using this in the context of jupyter notebook, you will need ipywidgets for the tqdm progress bar to render properly.

pip install ipywidgets jupyter nbextension enable --py widgetsnbextension `` 

Источник

Читайте также:  Отладка процессов в linux

Saved searches

Use saved searches to filter your results more quickly

You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session.

License

AlJohri/docx2pdf

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Sign In Required

Please sign in to use Codespaces.

Launching GitHub Desktop

If nothing happens, download GitHub Desktop and try again.

Launching GitHub Desktop

If nothing happens, download GitHub Desktop and try again.

Launching Xcode

If nothing happens, download Xcode and try again.

Launching Visual Studio Code

Your codespace will open once ready.

There was a problem preparing your codespace, please try again.

Latest commit

Added modifications to convert .doc-files for windows

Git stats

Files

Failed to load latest commit information.

README.md

Convert docx to pdf on Windows or macOS directly using Microsoft Word (must be installed).

On Windows, this is implemented via win32com while on macOS this is implemented via JXA (Javascript for Automation, aka AppleScript in JS).

brew install aljohri/-/docx2pdf 
usage: docx2pdf [-h] [--keep-active] [--version] input [output] Example Usage: Convert single docx file in-place from myfile.docx to myfile.pdf: docx2pdf myfile.docx Batch convert docx folder in-place. Output PDFs will go in the same folder: docx2pdf myfolder/ Convert single docx file with explicit output filepath: docx2pdf input.docx output.pdf Convert single docx file and output to a different explicit folder: docx2pdf input.docx output_dir/ Batch convert docx folder. Output PDFs will go to a different explicit folder: docx2pdf input_dir/ output_dir/ positional arguments: input input file or folder. batch converts entire folder or convert single file output output file or folder optional arguments: -h, --help show this help message and exit --keep-active prevent closing word after conversion --version display version and exit 
from docx2pdf import convert convert("input.docx") convert("input.docx", "output.pdf") convert("my_docx_folder/")

See CLI docs above (or in docx2pdf —help ) for all the different invocations. It is the same for the CLI and python library.

Читайте также:  Disable all logging linux

If you are using this in the context of jupyter notebook, you will need ipywidgets for the tqdm progress bar to render properly.

pip install ipywidgets jupyter nbextension enable --py widgetsnbextension `` 

Источник

Convert Word DOCX or DOC to PDF in Python

Word to PDF Conversion Python

Word to PDF is one of the most popular and immensely performed document conversions. The DOCX or DOC files are converted to PDF format before they are printed or shared. In this article, we will automate Word to PDF conversion in Python. The steps and code samples will demonstrate how to convert Word DOCX or DOC to PDF with Python. Also, you will learn about different options to customize Word to PDF conversion.

Python Library for Word to PDF Conversion — Free Download#

For converting Word documents to PDF format, we will use Aspose.Words for Python. It is a feature-rich Python library for creating and manipulating Word documents. Moreover, it lets you convert DOCX and DOC files to PDF format with high fidelity. The library is hosted on PyPI and you can install it using the following pip command.

Convert Word DOCX to PDF in Python#

The following are the steps to convert a Word document to PDF in Python.

  • Load the Word document using Document class.
  • Convert Word document to PDF using Document.save() method.

The following code sample shows how to convert a Word DOCX file to PDF.

Python Word to PDF with a Particular Standard#

You can also specify the particular standard for the converted PDF document such as PDF/A. The following are the steps to specify the PDF standard in Word to PDF conversion using Python.

  • Load the Word document using Document class.
  • Create an object of PdfSaveOptions class and set PDF standard using PdfSaveOptions.compliance property.
  • Convert Word document to PDF using Document.save() method.

The following code sample shows how to set a particular standard in Word DOCX to PDF conversion.

Python DOCX to PDF — Convert Range of Pages#

You can also specify the range of pages you want to convert to PDF format. For this, you can use PdfSaveOptions.page_set property. The following code sample shows how to convert a range of pages in Word document to PDF.

DOC DOCX to PDF in Python — Apply Image Compression#

Aspose.Words for Python also lets you apply image compression in the converted PDF document. In addition, you can specify the JPEG quality for the images. The following are the steps to set image compression while converting a Word DOCX to PDF in Python.

  • Load the Word document using Document class.
  • Create an object of PdfSaveOptions class.
  • Set image compression using PdfSaveOptions.image_compression property.
  • Set JPEG quality using PdfSaveOptions.jpeg_quality property.
  • Convert Word document to PDF using Document.save() method.
Читайте также:  Linux монтируем сетевой диск

The following code sample shows how to set image compression in Word to PDF conversion.

Python DOCX to PDF Library — Get a Free Library License#

You can get a temporary license in order to use Aspose.Words for Python without evaluation limitations.

Conclusion#

In this article, you have learned how to convert Word DOCX or DOC files to PDF in Python. Moreover, you have seen different options to customize the DOC or DOCX to PDF conversion in Python. You can learn more about Aspose.Words for Python using documentation. In case you would have any questions, feel free to let us know via our forum.

See Also#

Источник

How to Convert DocX to Pdf in Python

convert docx to pdf python linux

Sometimes you may need to convert docx files to pdfs. In this article, we will look at how to convert docx to pdf using Python. We will use docx2pdf library for this purpose.

How to Convert DocX to Pdf in Python

Here are the steps to convert docx to pdf files. Please note, Docx2pdf is available only in Windows. It is not supported in Linux. In such cases, it is better to use an online docx to pdf convertor like SmallPDF.

1. Install docx2pdf

Open command prompt & run the following command to install docx2pdf

2. Convert Docx to Pdf using command line

Here’s the syntax of docx2pdf

In the above command, you need to specify the file path of docx file as first argument and the file path of pdf file to written as second argument.

Here’s an example to convert docx to pdf

C:\> docx2pdf C:\Project\test.docx C:\Project\test.pdf

We have mentioned absolute paths for both input and outpur files. If you don’t mention absolute paths above, then docx2pdf will look for docx files as well as write pdf files in your present working directory.

3. Bulk Conversion using command line

You can also bulk convert a folder of docx to pdf files by specifying the folder path as input.

C:\> docx2pdf C:\Project\data_files

In the above command, docx2pdf will convert all docx files present in /home/ubuntu/data_files into pdf files.

You may also specify different input and output paths in docx2pdf command.

C:\> docx2pdf C:\Project\data_files C:\Project\test_files

4. Docx to PDF conversion from program

You may also import docx2pdf library within python program and use convert function to convert docx to pdf files.

using docx2pdf import convert #convert a single docx file to pdf file in same directory convert(test.docx) #convert docx to pdf specifying input & output paths convert('C:\Project\test.docx','C:\Project\test.pdf') #bulk conversion of files convert('C:\Project\')

As you can see it is very easy to convert docx to pdf files in python.

Источник

Оцените статью
Adblock
detector