I am running MATLAB R2010b on Windows 7 (64-bit, quad core, 16 GB RAM).
I have a large number of .csv.gz files, each between 200 MB and 6 GB in size, on my local drive.
myPathData = 'C:\myFile.csv.gz';
I need to extract each file. Timing with tic/toc, for a 200 MB file:
f = char(gunzip(myPathData));
the time taken is 152 minutes!
This is too slow for me.
I can manually extract the file using WinRAR in 13 minutes. I tried calling WinRAR from MATLAB, but it fails:
cd('C:\Program Files\WinRAR\');
[status,result] = system(['UnRAR.exe' ' ' 'e' ' ' myPathData]);
cd('V:\MatlabCode');
with the error message "csv.gz is not RAR archive" and "No files to extract".
I see the problem has also been posed here:
Does anyone know how I can dramatically speed up the unzipping? (My only thought is to write something in C# (like below) and then call that exe from system.m). Surely there is a better way?
Many Thanks!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.IO.Compression;

namespace ohMyGodICantBelieveIt
{
    public class unzipper
    {
        // http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx
        public static void Decompress(FileInfo fi)
        {
            // Get the stream of the source file.
            using (FileStream inFile = fi.OpenRead())
            {
                // Get original file extension, for example "doc" from report.doc.gz.
                string curFile = fi.FullName;
                string origName = curFile.Remove(curFile.Length - fi.Extension.Length);

                // Create the decompressed file.
                using (FileStream outFile = File.Create(origName))
                {
                    using (GZipStream Decompress = new GZipStream(inFile, CompressionMode.Decompress))
                    {
                        // Copy the decompression stream into the output file.
                        Decompress.CopyTo(outFile);
                    }
                }
            }
        }
    }
}
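For reference, a minimal sketch of how the MATLAB side of that idea might look. It assumes the C# class has been compiled into a console program with a small Main that passes its first argument to unzipper.Decompress (the code as posted has no Main). The executable name and location, C:\tools\unzipper.exe, are hypothetical and chosen to contain no spaces.

```matlab
% Hypothetical location of a console exe built from the C# class
% (assumes a Main wrapper that calls unzipper.Decompress on args[0]).
exePath    = 'C:\tools\unzipper.exe';
myPathData = 'C:\myFile.csv.gz';

% Quote the data path so that spaces in it do not split the argument.
cmd = sprintf('%s "%s"', exePath, myPathData);

% Time the external call the same way as the gunzip call.
tic;
[status, result] = system(cmd);
elapsed = toc;

% A non-zero exit status means the external program reported a failure.
if status ~= 0
    error('Decompression failed (status %d): %s', status, result);
end
fprintf('Extracted in %.1f s\n', elapsed);
```

Passing the full path to the executable avoids the cd calls before and after the system call.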
Best Answer