TransWikia.com

python extract uncompressed data from 7z-file

Stack Overflow Asked by user3820991 on November 4, 2021

I have several CSV files, some compressed and some not, all in one 7z archive. I want to read the CSV files and save their content in a database. However, whenever py7zlib attempts to read a CSV file that is stored uncompressed, I get the error "data error during decompression".

import os
import py7zlib

scr = r'Y:\Path\to\Archive'
z7file = 'ArchiveName.7z'

with open(os.path.join(scr,z7file),'rb') as f:
    archive = py7zlib.Archive7z(f)

    names = archive.filenames

    for mem in names:

        obj = archive.getmember(mem)
        print(obj.compressed)  # prints None for uncompressed data
        try:
            data = obj.read()
        except Exception as er:
            print(er)         # prints "data error during decompression"
                              # whenever obj.compressed is None

The error happens in

  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 608, in read
    data = getattr(self, decoder)(coder, data, level)
  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 671, in _read_lzma
    return self._read_from_decompressor(coder, dec, input, level, checkremaining=True, with_cache=True)
  File "C:\Anaconda\lib\site-packages\py7zlib.py", line 646, in _read_from_decompressor
    tmp = decompressor.decompress(data)
ValueError: data error during decompression

So, how can I extract uncompressed data from a 7z-Archive?

2 Answers

You can try another library, py7zr, which also supports 7zip archive compression, decompression, encryption and decryption. https://pypi.org/project/py7zr
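As a rough illustration of that suggestion, the sketch below extracts the whole archive into a temporary directory with py7zr and then parses every CSV file it finds. It assumes Python 3, that py7zr is installed, and that the CSV files are UTF-8 encoded. The helper names (`extract_archive`, `read_csv_files`, `load_csvs_from_archive`) are made up for this example, and it has not been run against the archive from the question, so whether the uncompressed members extract cleanly still needs checking.

```python
import csv
import os
import tempfile


def extract_archive(archive_path, dest_dir):
    """Extract every member of a 7z archive into dest_dir using py7zr."""
    # Imported lazily so the CSV helper below can be used without py7zr.
    import py7zr  # third-party package: pip install py7zr

    with py7zr.SevenZipFile(archive_path, mode="r") as archive:
        archive.extractall(path=dest_dir)


def read_csv_files(directory):
    """Return {relative_path: list_of_rows} for every .csv file under directory."""
    tables = {}
    for root, _dirs, files in os.walk(directory):
        for filename in sorted(files):
            if not filename.lower().endswith(".csv"):
                continue
            full_path = os.path.join(root, filename)
            # newline="" is what the csv module expects; UTF-8 is an assumption.
            with open(full_path, newline="", encoding="utf-8") as handle:
                rows = list(csv.reader(handle))
            tables[os.path.relpath(full_path, directory)] = rows
    return tables


def load_csvs_from_archive(archive_path):
    """Extract the archive to a temporary directory and parse its CSV files."""
    with tempfile.TemporaryDirectory() as tmp:
        extract_archive(archive_path, tmp)
        return read_csv_files(tmp)
```

Calling `load_csvs_from_archive(r'Y:\Path\to\Archive\ArchiveName.7z')` (a placeholder path) would return the parsed rows per file, which could then be written to the database.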

Answered by Zhou Hongbo on November 4, 2021

Though I couldn't figure out what the underlying problem was, I found a workaround that achieved the actual goal of getting the data out of the CSV files in the 7z archive. 7-Zip comes with a command-line tool. By calling that tool through the subprocess module, I could extract the files I wished to extract automatically and without any problems.

import subprocess
import py7zlib 

archiveman = r'c:\Program Files\7-zip\7z'  # 7z.exe comes with 7-zip
archivepath = r'C:\Path\to\archive.7z'

with open(archivepath,'rb') as f:
    archive = py7zlib.Archive7z(f)
    names = archive.filenames
    for name in names:
        _ = subprocess.check_output([archiveman, 'e', archivepath, '-o{}'.format(r'C:\Destination\of\copy'), name])

The different commands that can be used with 7z are listed in the 7-Zip command-line documentation.
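If every file in the archive is wanted, the py7zlib loop may not be necessary, because `7z e` with no file names extracts all members in one call. The sketch below assumes that, and separates building the argument list from running it so the command can be inspected first. The helper names `build_extract_command` and `extract_with_7z` are hypothetical, and the added `-y` switch (answer yes to prompts such as overwrite confirmations) is a choice made for this example, not part of the original workaround.

```python
import subprocess


def build_extract_command(seven_zip, archive_path, dest_dir, names=()):
    """Build the argument list for `7z e`, optionally limited to some members."""
    # 'e' extracts files without their directory structure, '-o<dir>' sets the
    # output directory (no space after -o), '-y' assumes yes on all queries.
    command = [seven_zip, "e", archive_path, "-o" + dest_dir, "-y"]
    command.extend(names)  # no names means: extract every member
    return command


def extract_with_7z(seven_zip, archive_path, dest_dir, names=()):
    """Run 7z and return its console output.

    subprocess.check_output raises CalledProcessError if 7z exits with a
    non-zero status, so a failed extraction does not pass silently.
    """
    command = build_extract_command(seven_zip, archive_path, dest_dir, names)
    return subprocess.check_output(command)
```

For example, `extract_with_7z(archiveman, archivepath, r'C:\Destination\of\copy')` would extract everything, while passing `names=['some.csv']` (a placeholder name) restricts extraction to that member.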

Answered by user3820991 on November 4, 2021
