Compressed archives and Unicode file names

May 3, 2008 at 6:05 PM
Edited May 14, 2008 at 8:59 AM
Memba Velodoc Outlook Add-In uses SharpZipLib to compress files as Zip archives.

Zip archives are not Unicode. A Zip archive is created with a code page, and if the code page on the sender's computer differs from the code page on the recipient's computer, non-ASCII file names are not displayed properly.
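
To illustrate the mismatch, here is a minimal Python sketch. It does not use SharpZipLib or a real archive; it only simulates what happens to the bytes of a file name. The function name and the choice of code pages (cp850 on the sender's side, cp1252 on the recipient's side) are assumptions made for the example.

```python
def as_seen_by_recipient(name, sender_codepage, recipient_codepage):
    """Simulate a file name stored with one code page and read with another."""
    # The archiver on the sender's machine stores the name as raw bytes.
    raw = name.encode(sender_codepage)
    # The tool on the recipient's machine interprets the same bytes
    # with its own code page, which may map them to other characters.
    return raw.decode(recipient_codepage, errors="replace")


original = "résumé.txt"

# Different code pages: the accented letters come out as other characters.
garbled = as_seen_by_recipient(original, "cp850", "cp1252")

# Same code page on both sides: the name survives.
intact = as_seen_by_recipient(original, "cp850", "cp850")

# Plain ASCII names are unaffected, since both code pages agree on ASCII.
ascii_name = as_seen_by_recipient("report.txt", "cp850", "cp1252")
```

This is why only names with accented or non-Latin characters are affected, and why the problem does not show up when sender and recipient use the same regional settings.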

For more information, see http://community.icsharpcode.net/search/SearchResults.aspx?q=Unicode+AND+sectionid%3a12&o=Relevance and more specifically http://community.icsharpcode.net/forums/p/4546/13084.aspx#13084

This is by design (Zip spec) and the only fix we can offer is to add more compression options using other compression algorithms.
The nice thing about Zip archives, though, is that Windows Explorer handles them without the need to install an archiving tool like WinZip or WinRAR.
Oct 4, 2008 at 11:56 PM

The Zip spec does allow for Unicode file names. The Zip spec as of Sept 2007 allows UTF-8.

It is true, though, that most Zip files are not created with UTF-8 encoding. Most Zip files, regardless of the tool used, are created with the "Default" code page of the machine on which they are created.

And when unzipping, you have to know that code page in order to get correct results.
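
As a rough sketch of what that means in practice, the following uses Python's standard zipfile module (not SharpZipLib or DotNetZip). It relies on two things: the spec marks UTF-8 names with general purpose bit 11 (0x800), and Python's zipfile decodes unflagged names as cp437. The helper names and the legacy_codepage parameter are hypothetical; the caller still has to know or guess the code page the archive was created with.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: name and comment are UTF-8


def fix_name(info, legacy_codepage):
    """Return the entry name, re-decoding legacy names with a known code page."""
    if info.flag_bits & UTF8_FLAG:
        # zipfile has already decoded this name as UTF-8.
        return info.filename
    # Without the flag, zipfile decoded the raw bytes as cp437.
    # Undo that step and decode again with the code page we were told.
    return info.filename.encode("cp437").decode(legacy_codepage)


def list_names(data, legacy_codepage):
    """List entry names from Zip bytes, assuming one legacy code page."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [fix_name(info, legacy_codepage) for info in zf.infolist()]
```

For example, list_names(data, "cp866") would be a reasonable call for an archive known to come from a machine with a Russian OEM code page. If the code page is wrong, the names come out garbled again, which is the point made above.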

Oct 5, 2008 at 11:26 AM
Thank you for the tip.

I noticed that you are the author of the DotNetZip library available on CodePlex, so I assume you know your stuff, and I have the following questions:
1) What are the strengths and weaknesses of your library compared to SharpZipLib?
2) When do you plan to implement UTF-8 archived file names in your project (it seems that you have not yet implemented them)?
3) Does the Windows Zip shell extension support UTF-8 archived file names (I do not think so), and if not, how do you expect such Zip archives to behave in Windows?

Thanks for the help.