GZipStream is a great tool for compressing and decompressing byte arrays of serialized data. But it has a pitfall that can leave you stuck for some time. Here I describe the issue, a mismatch between the original and the decompressed data, and provide a solution.
Compressing data is useful when you need to transfer data across a distributed system (e.g. between a web service and its client applications). The first time I needed it was when I was writing a tool that generates a huge sample of statistical data, which had to be stored in a file and sent by email. Thanks to the Microsoft guys, who had already thought about that and created the useful GZipStream class (find more on MSDN). I didn't find any alternative at the time, so I'd say this is the most suitable tool for adding compression and decompression to a C# application.
Data Mismatch
While implementing my code I got stuck for some time on a behavior of the class that was unknown to me at that point. After decompression, the new byte array didn't match the original: the last byte was zero where the original had a non-zero value. As I figured out, the reason is that GZipStream keeps an internal buffer, so the stream must be closed before the compressed array is used.
The sample below shows the issue and what caused it. The incorrect version of the code sample:
    using (MemoryStream ms = new MemoryStream())
    {
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress))
        {
            zip.Write(uncompressedData, 0, uncompressedData.Length);
            descriptor.Size = uncompressedData.Length;
            descriptor.Data = ms.ToArray();
        }
    }
At the moment I call ms.ToArray(), the GZip stream has not finished its work and has not yet written all of its data to the memory stream. This is the cause of my issue, and it can be solved by closing the GZip stream before reading the byte array. I rewrote the code this way:
    using (MemoryStream ms = new MemoryStream())
    {
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress))
        {
            zip.Write(uncompressedData, 0, uncompressedData.Length);
            zip.Close();
        }
        descriptor.Size = uncompressedData.Length;
        descriptor.Data = ms.ToArray();
    }
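To see the difference between the two versions, here is a minimal sketch that compresses the same input both ways and compares the results. The class and method names (FlushDemo, CompressReadingEarly, CompressReadingAfterDispose, Decompress) are made up for this illustration, and the sketch assumes .NET 4.0 or later because it uses Stream.CopyTo. Exact byte counts depend on the runtime, so none are shown.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class FlushDemo
{
    // Reads the MemoryStream while the GZipStream is still open (the problematic order).
    public static byte[] CompressReadingEarly(byte[] input)
    {
        using (MemoryStream ms = new MemoryStream())
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress))
        {
            zip.Write(input, 0, input.Length);
            return ms.ToArray(); // too early: the final block and gzip trailer are not written yet
        }
    }

    // Reads the MemoryStream only after the GZipStream has been disposed.
    public static byte[] CompressReadingAfterDispose(byte[] input)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress))
            {
                zip.Write(input, 0, input.Length);
            } // disposing the GZipStream flushes its remaining data into ms
            return ms.ToArray(); // ToArray still works on a closed MemoryStream
        }
    }

    // Decompresses without knowing the original size in advance.
    public static byte[] Decompress(byte[] compressed)
    {
        using (MemoryStream source = new MemoryStream(compressed))
        using (GZipStream zip = new GZipStream(source, CompressionMode.Decompress))
        using (MemoryStream target = new MemoryStream())
        {
            zip.CopyTo(target);
            return target.ToArray();
        }
    }

    public static void Main()
    {
        byte[] input = new byte[10000];
        new Random(42).NextBytes(input);

        byte[] early = CompressReadingEarly(input);
        byte[] complete = CompressReadingAfterDispose(input);

        Console.WriteLine("read early: {0} bytes, read after dispose: {1} bytes",
            early.Length, complete.Length);
        Console.WriteLine("round trip matches: {0}",
            Decompress(complete).SequenceEqual(input));
    }
}
```

The array read too early is shorter than the complete one, which is why decompressing it cannot reproduce the original data. Since the using block already disposes the GZipStream, the explicit zip.Close() call in the corrected version is harmless but not strictly required.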
Code Snippet
The class ZipData can be used to transfer compressed data between tiers.
    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Runtime.Serialization.Formatters.Binary;

    namespace GZipTest
    {
        public class ZipData
        {
            // Carries the compressed bytes together with the original length.
            [Serializable]
            private class ZippedDataDescriptor
            {
                public int Size;
                public byte[] Data;
            }

            public static byte[] Zip(byte[] uncompressedData)
            {
                if (uncompressedData == null)
                    throw new ArgumentNullException("uncompressedData", "Uncompressed data must be specified and not null.");

                ZippedDataDescriptor descriptor = new ZippedDataDescriptor();
                using (MemoryStream ms = new MemoryStream())
                {
                    using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress))
                    {
                        zip.Write(uncompressedData, 0, uncompressedData.Length);
                        zip.Close();
                    }
                    // Read the buffer only after the GZipStream has been closed.
                    descriptor.Size = uncompressedData.Length;
                    descriptor.Data = ms.ToArray();
                }

                // Note: BinaryFormatter is marked obsolete in newer .NET versions.
                using (MemoryStream ms = new MemoryStream())
                {
                    new BinaryFormatter().Serialize(ms, descriptor);
                    return ms.ToArray();
                }
            }

            public static byte[] UnZip(byte[] compressedData)
            {
                if (compressedData == null)
                    throw new ArgumentNullException("compressedData", "Compressed data must be specified and not null.");

                ZippedDataDescriptor descriptor;
                using (MemoryStream ms = new MemoryStream(compressedData))
                {
                    object descriptorData = new BinaryFormatter().Deserialize(ms);
                    if (descriptorData == null)
                        throw new InvalidDataException("A descriptor of the zipped data is invalid.");
                    descriptor = (ZippedDataDescriptor)descriptorData;
                }

                using (MemoryStream ms = new MemoryStream(descriptor.Data))
                using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
                {
                    byte[] uncompressedData = new byte[descriptor.Size];
                    // A single Read call may return fewer bytes than requested,
                    // so keep reading until the whole buffer is filled.
                    int offset = 0;
                    while (offset < uncompressedData.Length)
                    {
                        int read = zip.Read(uncompressedData, offset, uncompressedData.Length - offset);
                        if (read == 0)
                            throw new EndOfStreamException("Compressed data ended before the expected size was reached.");
                        offset += read;
                    }
                    return uncompressedData;
                }
            }
        }
    }
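The class above relies on BinaryFormatter to carry the original size next to the compressed bytes, and newer .NET versions mark BinaryFormatter as obsolete. As a hedged alternative, not the approach used in this post, the sketch below stores the size as a 4-byte prefix in front of the GZip data. The class name LengthPrefixedZip and the prefix layout are assumptions made for this illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class LengthPrefixedZip
{
    // Output layout (hypothetical): [4-byte original length][gzip data]
    public static byte[] Zip(byte[] uncompressedData)
    {
        if (uncompressedData == null)
            throw new ArgumentNullException("uncompressedData");

        using (MemoryStream ms = new MemoryStream())
        {
            // BinaryWriter writes the int in little-endian order on every platform.
            BinaryWriter writer = new BinaryWriter(ms);
            writer.Write(uncompressedData.Length);
            writer.Flush();

            // leaveOpen: true keeps ms usable after the GZipStream is disposed.
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
            {
                zip.Write(uncompressedData, 0, uncompressedData.Length);
            }

            // The GZipStream is disposed, so ms now holds the complete output.
            return ms.ToArray();
        }
    }

    public static byte[] UnZip(byte[] compressedData)
    {
        if (compressedData == null)
            throw new ArgumentNullException("compressedData");

        using (MemoryStream ms = new MemoryStream(compressedData))
        {
            BinaryReader reader = new BinaryReader(ms);
            int size = reader.ReadInt32();
            if (size < 0)
                throw new InvalidDataException("Negative length prefix.");

            byte[] result = new byte[size];
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
            {
                // Read may return fewer bytes than requested, so loop until done.
                int offset = 0;
                while (offset < size)
                {
                    int read = zip.Read(result, offset, size - offset);
                    if (read == 0)
                        throw new EndOfStreamException("Compressed data ended early.");
                    offset += read;
                }
            }
            return result;
        }
    }
}
```

Calling LengthPrefixedZip.UnZip(LengthPrefixedZip.Zip(data)) should return a byte array equal to data. The output is not compatible with what ZipData produces, so both sides of a transfer would need to use the same class.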