Linux excel to csv

Convert xlsx to csv in Linux with command line

I’m looking for a way to convert xlsx files to csv files on Linux. I do not want to use PHP/Perl or anything like that since I’m looking at processing several millions of lines, so I need something quick. I found a program on the Ubuntu repos called xls2csv but it will only convert xls (Office 2003) files (which I’m currently using) but I need support for the newer Excel files. Any ideas?

Thinking that anything implemented with a scripting language is going to be slow by nature seems a little misguided, particularly since the interesting libraries in those languages tend to have backends written in C.

Excel used to be limited to 65536 rows. Now it’s 1,048,576 (support.microsoft.com/kb/120596). It’s going to be tough to fit "several millions of lines" in it. Just saying.

Personally, I’d do this using the xlsv library for Python, but since scripting-based approaches are described as out of the question... shrug. (How is it a programming question if programmatic tools are excluded from the answer?)

@CharlesDuffy I’m currently using a PHP library to do this, and what takes xls2csv 1 second to do, takes php 10 minutes to do. Literally.

12 Answers

The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:

$ ssconvert Book1.xlsx newfile.csv
Using exporter Gnumeric_stf:stf_csv
$ cat newfile.csv
Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line

Really the most hassle-free method of converting spreadsheets. Combined with a bash script, it will let you batch-process multiple files: for f in *.csv; do ssconvert "$f" "${f%.csv}.xlsx"; done The LibreOffice method could probably process other formats, but I could not make it work (it would simply open a blank file every time, even with the --headless argument).

@sebleblanc Not quite hassle-free. The installation is a pain given the number of dependencies (if you’re doing this on a headless server). So far gcc, intltool, zlib-devel, GTK. GTK requires glib, atk, pango, cairo, cairo-object, gdk-pixbuf-2.0.

I managed to install it on a headless Debian server with apt-get install gnumeric --no-install-recommends . The only drawback is that it emits lots of warnings ( GConf-WARNING **: Client failed to connect to the D-BUS daemon ) when running. A simple ssconvert oldfile.xlsx newfile.csv > /dev/null 2>&1 will silence them.

@hhh The separator option only works with the txt export type. You can use this to print to stdout: ssconvert -O "separator=;" -T Gnumeric_stf:stf_assistant file.xlsx fd://1 .
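Putting the comments above together, a batch conversion from .xlsx to .csv with the warnings silenced might look like the sketch below. The function name convert_all_xlsx is made up for illustration, and the sketch assumes ssconvert is on the PATH.

```shell
# Hypothetical helper: convert every .xlsx file in a directory to CSV.
# Assumes Gnumeric's ssconvert is installed and on the PATH.
convert_all_xlsx() {
    dir=${1:-.}
    for f in "$dir"/*.xlsx; do
        # Skip the unexpanded pattern when the directory has no .xlsx files.
        [ -e "$f" ] || continue
        out="${f%.xlsx}.csv"
        # Discard stdout and stderr to hide the GConf/D-Bus warnings.
        if ssconvert "$f" "$out" >/dev/null 2>&1; then
            echo "converted: $f -> $out"
        else
            echo "failed: $f" >&2
        fi
    done
}
```

Calling convert_all_xlsx with no argument works on the current directory. Because all output from ssconvert is discarded, real errors are reduced to the "failed" line, so rerun a failing file by hand to see the actual message.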

You can do this with LibreOffice:

libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"

For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to your sudoers file:

users ALL=(ALL) NOPASSWD: libreoffice 

Allowing everyone to run libreoffice through sudo without a password is opening a can of worms. Please beware of the consequences, including the possibility of acquiring root permissions on a multi-user platform.


/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to csv $filename worked on OS X for me.

To convert to UTF-8, preserving non-ASCII characters, use instead --convert-to "csv:Text - txt - csv (StarCalc):44,34,76,1,1/1" . See the OpenOffice wiki for details.
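The numbers in that filter string are positional tokens. As documented on the OpenOffice wiki filter options page, the first two are the field separator and the text delimiter as character codes (44 is a comma, 34 is a double quote), the third is the character set (76 is UTF-8), and the fourth is the first line to export. A small sketch that builds the string for a different separator follows; csv_filter is a hypothetical name, and the fifth token is copied unchanged from the comment above.

```shell
# Hypothetical helper: build a LibreOffice CSV export filter string.
# $1 = field separator character (default: comma)
# $2 = text delimiter character (default: double quote)
# Token 3 (76) selects UTF-8 and token 4 (1) starts at the first line;
# the trailing 1/1 is taken as-is from the comment above.
csv_filter() {
    sep=${1:-,}
    quote=${2:-\"}
    # printf '%d' "'x" prints the numeric character code of x.
    printf 'csv:Text - txt - csv (StarCalc):%d,%d,76,1,1/1\n' "'$sep" "'$quote"
}

# Usage (assumes libreoffice is on the PATH):
#   libreoffice --headless --convert-to "$(csv_filter ';')" file.xlsx
```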

If you already have a desktop environment then I’m sure Gnumeric or LibreOffice would work well, but on a headless server (e.g. any cloud-based environment), they require dozens of dependencies that you also need to install.

I found this Python alternative: xlsx2csv

easy_install xlsx2csv
xlsx2csv file.xlsx > newfile.csv

It took two seconds to install and works like a charm.

If you have multiple sheets, you can export all at once, or one at a time:

xlsx2csv file.xlsx --all > all.csv
xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
xlsx2csv file.xlsx -s 1 > sheet1.csv

The project’s author also links to several alternatives built in Bash, Python, Ruby, and Java.

Works great, but I can only get it to run with sudo ( IOError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/prettytable-0.7.2-py2.7.egg/EGG-INFO/top_level.txt' ). Now that I think about it, I got the same error with csvkit .

It worked great for me and allowed the extraction of each sheet to individual files using the -s option. Where LibreOffice was not able to handle the size of the sheet, xlsx2csv had no problems.

In Debian and Ubuntu there is the xlsx2csv package, so you don’t need to manually install it through easy_install but can use your package manager.

I have no idea how robust or feature-complete xlsx2csv is, but it seems to be actively maintained. Installing Gnumeric on macOS via Homebrew involves more than 30 dependencies, and LibreOffice is a download of several hundred MB. By comparison, xlsx2csv has zero(!) dependencies, comes at just 50 KB, and worked perfectly for my use case (converting the output of PaddleOCR to csv). Either install it with pip install xlsx2csv or download the latest release from the repository and run xlsx2csv.py .
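For the per-sheet extraction mentioned in the comments, a loop around the -s option is one way to write each sheet to its own file. This is only a sketch: export_sheets is a hypothetical name, the number of sheets has to be supplied by the caller, and xlsx2csv is assumed to be on the PATH.

```shell
# Hypothetical helper: export sheets 1..N of a workbook to separate CSV files.
# $1 = workbook path, $2 = number of sheets to export (caller-supplied).
export_sheets() {
    workbook=$1
    count=$2
    base=${workbook%.xlsx}
    i=1
    while [ "$i" -le "$count" ]; do
        # -s selects a sheet by its 1-based number, as in the answer above.
        xlsx2csv -s "$i" "$workbook" > "${base}_sheet${i}.csv" || return 1
        i=$((i + 1))
    done
}

# Usage: export_sheets report.xlsx 3
#   writes report_sheet1.csv, report_sheet2.csv, report_sheet3.csv
```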

In Bash, I used this LibreOffice command (executable libreoffice ) to convert all my .xlsx files in the current directory:

for i in *.xlsx; do libreoffice --headless --convert-to csv "$i" ; done 

Close all your LibreOffice open instances before executing, or it will fail silently.

The command takes care of spaces in the filename.

I tried it again some years later, and it didn’t work. This question gives some tips, but the quickest solution was to run it as root (or with sudo libreoffice ). It is not elegant, but it is quick.
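One alternative to running as root that may be worth trying is to point LibreOffice at a throwaway user profile with -env:UserInstallation, so the conversion does not share a profile with an instance that is already open. That this resolves the silent failure is an assumption, not something stated in the answer, and lo_convert is a hypothetical name.

```shell
# Hypothetical helper: convert one file using a private LibreOffice profile.
# $1 = input file, $2 = output directory (default: current directory).
# Assumption: a separate profile avoids the conflict with running instances.
lo_convert() {
    profile=$(mktemp -d) || return 1
    libreoffice "-env:UserInstallation=file://$profile" \
        --headless --convert-to csv --outdir "${2:-.}" "$1"
    rc=$?
    # Remove the temporary profile whether or not the conversion succeeded.
    rm -rf "$profile"
    return "$rc"
}

# Usage: for i in *.xlsx; do lo_convert "$i"; done
```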

Use the command scalc.exe in Windows.

Make sure you close all OpenOffice windows before attempting this, as it will silently fail otherwise.


Also, on Windows, the command is scalc.exe rather than libreoffice . Worked for me today on current stable LO version.

Another option would be to use R via a small Bash wrapper for convenience:

xlsx2txt() {
    echo '
    require(xlsx)
    write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=F, row.names=FALSE, col.names=T, sep="\t")
    ' | Rscript --vanilla - "$1" 2>/dev/null
}

xlsx2txt file.xlsx > file.txt

If the .xlsx file has many sheets, the -s flag can be used to get the sheet you want. For example:

xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv 

second_sheet.csv would contain the data of the second sheet in my_file.xlsx .

Using the Gnumeric spreadsheet application, which comes with a command-line utility called ssconvert, is indeed super simple:

find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;

The ssconvert command above only converts 65536 lines, but I have more than one lakh (100,000) lines. Can you help me?

If you are OK with running Java from the command line, you can do it with Apache POI HSSF’s Excel Extractor. It has a main method that is described as the command-line extractor. This one seems to just dump everything out. They point to this example that converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.

Another option that might fly, but will require some work on the other end, is to make your Excel files come to you as Excel XML Data or XML Spreadsheet or whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.

You can use executable libreoffice to convert your .xlsx files to csv:

libreoffice --headless --convert-to csv ABC.xlsx 

The --headless argument indicates that we don’t need a GUI.


How to Convert xlsx to CSV Format in Linux

Microsoft Excel, the Windows-based spreadsheet application, is known for its support of Open XML spreadsheet files, which use the XLSX file format.

As you adapt or migrate to the Linux operating system environment, you will find the CSV (Comma-Separated Values) file format a lot more convenient for the following reasons:

  • It opens in any text editor.
  • It is supported by most database-oriented applications.
  • It is easily manipulated.
  • It is easily parsable.

A more practical scenario is using the CSV file format to quickly populate an application’s database. In this case, if your targeted data is in XLSX format, you will need to find a way of converting it to CSV before uploading the targeted file data to your database application.
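As an illustration of that scenario, the sketch below loads a converted CSV file into a SQLite database. SQLite is an assumption chosen for the example, since the article does not name a database, and the function, database, and table names are hypothetical.

```shell
# Hypothetical helper: import a CSV file into a SQLite table.
# $1 = database file, $2 = table name, $3 = CSV file.
# Assumes the sqlite3 command-line shell is installed and that the CSV
# path contains no spaces or quotes (dot-commands split on whitespace).
csv_to_sqlite() {
    db=$1
    table=$2
    csv=$3
    # If the table does not exist yet, sqlite3 creates it and takes the
    # column names from the first CSV row.
    sqlite3 "$db" ".mode csv" ".import $csv $table"
}

# Usage: csv_to_sqlite data.db people converted.csv
```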

This article will familiarize you with several approaches to performing this conversion.

How to Convert xlsx to CSV Using Gnumeric Tool in Linux

Gnumeric, the GNOME spreadsheet application, mirrors basic Excel features such as importing and exporting CSV, LaTeX, OpenDocument, and HTML, among other formats.


Install Gnumeric in Linux

You can install Gnumeric on your Linux distribution with one of the following commands:

$ sudo apt-get install gnumeric         [On Debian, Ubuntu and Mint]
$ sudo yum install gnumeric             [On RHEL/CentOS/Fedora and Rocky Linux/AlmaLinux]
$ sudo emerge -a sys-apps/gnumeric      [On Gentoo Linux]
$ sudo pacman -S gnumeric               [On Arch Linux]
$ sudo zypper install gnumeric          [On OpenSUSE]

Gnumeric ships with the ssconvert command, which converts an XLSX file to a CSV file.

Converting XLSX to CSV Using Gnumeric

Consider the following XLSX file sample:

[Figure: XLSX File in Linux]

To convert it to CSV with the Gnumeric spreadsheet program, run the following commands:

$ ssconvert --export-type=Gnumeric_stf:stf_csv file_example.xlsx gnumeric_converted.csv
$ cat gnumeric_converted.csv

The cat command should display the resulting CSV file on your Linux terminal.

[Figure: List CSV File Content]

The content of CSV files can be displayed on the Linux terminal with the cat command, whereas XLSX files cannot be displayed this way, which is why the screen capture above shows the converted file.

How to Convert xlsx to CSV Using xlsx2csv Converter

The xlsx2csv command is a Python-based XLSX to CSV file converter. You can install it on your Linux distribution with one of the following commands:

Install xlsx2csv in Linux

$ sudo apt-get install xlsx2csv         [On Debian, Ubuntu and Mint]
$ sudo yum install xlsx2csv             [On RHEL/CentOS/Fedora and Rocky Linux/AlmaLinux]
$ sudo emerge -a sys-apps/xlsx2csv      [On Gentoo Linux]
$ sudo pacman -S xlsx2csv               [On Arch Linux]
$ sudo zypper install xlsx2csv          [On OpenSUSE]

Converting XLSX to CSV Using xlsx2csv Converter

To use it on our sample XLSX file, run the command as follows:

$ xlsx2csv file_example.xlsx > xlsx2csv_converted.csv

[Figure: Convert XLSX to CSV in Linux]

Using the cat command, we are able to output the content of the resulting CSV file on our Linux terminal.

$ cat xlsx2csv_converted.csv

[Figure: View CSV File Content]

How to Convert xlsx to CSV Using csvkit Tool

csvkit is a Python-based toolkit for working with CSV files that can also convert XLSX to CSV. It is user-friendly and lightweight, and can be installed on your Linux distribution with one of the following commands:

Install csvkit in Linux

$ sudo apt-get install csvkit           [On Debian, Ubuntu and Mint]
$ sudo yum install csvkit               [On RHEL/CentOS/Fedora and Rocky Linux/AlmaLinux]
$ sudo emerge -a sys-apps/csvkit        [On Gentoo Linux]
$ sudo pacman -S csvkit                 [On Arch Linux]
$ sudo zypper install csvkit            [On OpenSUSE]

Converting XLSX to CSV Using csvkit Command

To convert a file from XLSX to CSV format with csvkit, we will use its in2csv command as demonstrated below.

$ in2csv file_example.xlsx > csvkit_converted.csv
$ cat csvkit_converted.csv
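When a workbook has several sheets, in2csv can list their names with --names and convert a single one with --sheet. A sketch that combines the two to write one CSV file per sheet follows. The function name is hypothetical, and sheet names are assumed not to contain slashes, since they are used directly in the output file names.

```shell
# Hypothetical helper: write each sheet of a workbook to its own CSV file.
# Assumes csvkit's in2csv is on the PATH.
in2csv_all_sheets() {
    workbook=$1
    # --names prints the sheet names, one per line.
    in2csv --names "$workbook" | while IFS= read -r sheet; do
        # Redirect stdin so the inner call cannot consume the name list.
        in2csv --sheet "$sheet" "$workbook" \
            > "${workbook%.xlsx}_${sheet}.csv" < /dev/null
    done
}

# Usage: in2csv_all_sheets file_example.xlsx
```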

With these three approaches to converting XLSX files to CSV, you should find one that suits your Linux workflow.

You can explore more usage options of these XLSX to CSV conversion tools through their man pages:

$ man ssconvert
$ man xlsx2csv
$ man in2csv

Of the three, xlsx2csv is the recommended choice because it produces fewer conversion warnings.
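For scripts that have to run on machines with different tools installed, a small wrapper can try the three converters in order of preference. The sketch below is one possible arrangement, not part of any of the tools, and the name xlsx_to_csv is hypothetical.

```shell
# Hypothetical wrapper: convert with whichever of the three tools is installed.
# Tries xlsx2csv first (the preference stated above), then in2csv, then ssconvert.
# $1 = input XLSX file, $2 = output CSV file.
xlsx_to_csv() {
    src=$1
    dst=$2
    if command -v xlsx2csv >/dev/null 2>&1; then
        xlsx2csv "$src" > "$dst"
    elif command -v in2csv >/dev/null 2>&1; then
        in2csv "$src" > "$dst"
    elif command -v ssconvert >/dev/null 2>&1; then
        ssconvert --export-type=Gnumeric_stf:stf_csv "$src" "$dst"
    else
        echo "no XLSX-to-CSV converter found" >&2
        return 1
    fi
}

# Usage: xlsx_to_csv file_example.xlsx converted.csv
```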
