Binary files on Usenet
UUEncode, MIME and Multipart

Usenet was originally designed for discussions only, just like e-mail. The problem is that articles may only contain 7-bit characters, while binary files use 8-bit characters. Since some older gateways only pass 6-bit characters reliably, two techniques were developed that transform three 8-bit characters into four 6-bit characters, which increases the file size by about 33%. Such a blowup in file size shows why Usenet is certainly not the best way to distribute binary files.

UUEncode

The first and older technique is UUEncode. The data is encoded and placed directly into the article body. When people download the post, their Usenet client decodes the file again and either displays it or lets them save it to disk. Every Usenet client should support UUEncode. If yours doesn't, you can decode the file with an external decoder. An encoded file always starts like this:

begin 666 test.jpg

And it ends like this:

`
end

Everything in between is the encoded file. To decode it with an external decoder, copy the whole encoded data (including the start and end lines) to your clipboard, create a new .UUE file (the name doesn't matter, since UUEncode stores the name of the original file), paste the clipboard content into it and save it to disk. Now you can run a UUDecoder on this file. Windows users can use WinZIP for that purpose. The file opens like a ZIP archive and you can extract the encoded data as usual (in our example "test.jpg").

MIME

The newer solution is named MIME, and it's the same solution that was chosen for e-mail. An advantage of MIME is that attachments are clearly separated from the message text and from each other (in case you have multiple attachments in a single post). A big disadvantage of MIME is that plenty of Usenet clients still don't support it, and people have to use external MIME decoders in that case.
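
The three-to-four transformation mentioned at the start can be tried out directly. The following is only a minimal sketch, assuming Python's standard `binascii` and `base64` modules as stand-ins for what a Usenet client does internally; Base64 is the encoding that MIME posts normally use for binaries. The 45 sample bytes are made up for illustration.

```python
import base64
import binascii

# 45 bytes stand in for binary file data; 45 is the most that
# binascii.b2a_uu accepts for a single UUEncode line.
data = bytes(range(45))

# UUEncode: one length character, 60 encoded characters and a newline.
uu_line = binascii.b2a_uu(data)

# Base64: 60 encoded characters (line breaks are added separately by MIME).
b64 = base64.b64encode(data)

print(len(data), len(uu_line), len(b64))  # 45 62 60

# Decoding restores the original bytes in both cases.
assert binascii.a2b_uu(uu_line) == data
assert base64.b64decode(b64) == data
```

In both cases 45 bytes become 60 encoded characters, which is the increase of about 33% described above; the length character and the line ends add a little more on top.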
An encoded file always starts like this:

------=_NextPart_000_0009_01BFD33D.191D9C60

And it ends like this:

------=_NextPart_000_0009_01BFD33D.191D9C60--

The boundaries (in our case "_NextPart_" and a random number) might be different in your case, because Usenet clients are free to choose any boundary they like. The header lines that follow the opening boundary provide a lot of information about the encoded file. An interesting line is "Content-Transfer-Encoding", which names the encoding that was used, typically BASE64 for binary files. BASE64 works similarly to UUEncode. It's not the only encoding that MIME supports, but it's the one usually used for binary files on Usenet and in e-mail. Text files are not encoded with BASE64 but with Quoted-Printable (where text characters stay nearly unaltered and only some specific characters are encoded). Binaries could also be encoded with Quoted-Printable, but that would increase their size by over 100%. Other encodings are usually not supported by Usenet clients.

In case your client doesn't support MIME but you want to decode a MIME file, copy the encoded file to the clipboard (including the boundary lines and all header information), create a new .MIM file (again, the name doesn't matter), paste the clipboard content into it and save it to disk. Now you can run a MIME decoder on that file. Windows users can use WinZIP here as well, as it supports both UUEncode and MIME. Open the file with WinZIP and extract it as usual.

Multipart

Since big articles don't propagate very well and some servers even filter articles larger than a specific size (the average maximum size seems to be around 4 MB, but some already filter everything larger than 600 KB), a system called "Multipart Message" was invented. If an article with a binary attachment would exceed a certain size, it's broken into smaller pieces. Those pieces can then travel through Usenet without major problems.
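
The splitting step can be pictured with a short sketch. It is not how any particular posting program works: the function name `split_into_parts` and the limit `max_chars` are hypothetical, and real software may count lines or bytes instead of characters.

```python
def split_into_parts(encoded_text: str, max_chars: int) -> list[str]:
    """Cut already encoded article text into pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [
        encoded_text[start:start + max_chars]
        for start in range(0, len(encoded_text), max_chars)
    ]


# Stand-in for the UUEncode or MIME text of a large attachment.
encoded = "A" * 2500
parts = split_into_parts(encoded, 1000)

print([len(part) for part in parts])  # [1000, 1000, 500]

# Gluing the pieces back together in order gives the original text again.
assert "".join(parts) == encoded
```

Note that the pieces are cut from text that is already encoded, so a single piece can't be decoded by itself.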
Most Usenet clients can reunite those pieces again (which is necessary, otherwise you can't decode the binary data), and some even recognize multipart messages and reunite them automatically (e.g. Agent does that). In any case, if you want to get the file, you must download ALL parts of a multipart message. Since Usenet isn't reliable, sometimes you won't be able to decode a file because a single part is missing. That's very annoying, but it happens fairly often.

Identifying a Multipart Message is easy. All parts share the same subject line, which ends with [xx/yy]. This means it is part xx out of yy parts. E.g. if you see a post that ends with [04/13], you know that this message is part of a Multipart Post, that it is part number four, and that there are thirteen parts altogether.

In case your client can't reunite Multipart Messages, you can do that yourself. First find out whether the file is encoded with MIME or UUEncode (even though it's a Multipart Message, it must still be encoded!). The best way to do that is to download the first part, which must contain either a MIME or a UUEncode header. Then create a file, choose the extension accordingly (.MIM or .UUE) and paste the whole content of the first post into this file. Now download the second part, copy its content and append it to the first part. Be careful: if the first part ends in the middle of a line, you must append the second part exactly at that position and NOT on a new line. Continue with that procedure until all parts are reunited, then save your file to disk. Now you can decode it as described before. As you can see, that's a lot of work, but it will work.

BTW, sometimes you also see [00/13], which doesn't mean that this is part 0 (the first part has the number one). This post is usually a text post, e.g. some comments on the binary posts. When decoding a file manually, do NOT include this part, as it's not part of the binary file.
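
The manual procedure above can also be sketched in a few lines of code. This is only an illustration under assumptions: the names `parse_part` and `reunite` are made up, and the article bodies are assumed to have been copied exactly as posted, without adding or removing line breaks.

```python
import re

# Matches a trailing part counter such as "[04/13]" at the end of a subject line.
PART_RE = re.compile(r"\[(\d+)/(\d+)\]\s*$")


def parse_part(subject):
    """Return (part, total) for a subject ending in [xx/yy], otherwise None."""
    match = PART_RE.search(subject)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def reunite(articles):
    """Join the bodies of a multipart post in part order.

    articles is an iterable of (subject, body) pairs. Part 0 is skipped,
    because it is a text post and not part of the binary file.
    """
    bodies = {}
    total = None
    for subject, body in articles:
        parsed = parse_part(subject)
        if parsed is None or parsed[0] == 0:
            continue
        part, total = parsed
        bodies[part] = body
    if total is None:
        raise ValueError("no multipart articles found")
    missing = [n for n in range(1, total + 1) if n not in bodies]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    # Plain concatenation: a part that ends in the middle of a line is
    # continued exactly at that position by the next part.
    return "".join(bodies[n] for n in range(1, total + 1))


posts = [
    ("holiday pictures [2/3]", "BBB\n"),
    ("holiday pictures [0/3]", "Some comments on the pictures\n"),
    ("holiday pictures [1/3]", "begin 666 test.jpg\nAA"),
    ("holiday pictures [3/3]", "`\nend\n"),
]
print(parse_part("holiday pictures [04/13]"))  # (4, 13)
print(reunite(posts))
```

The result of `reunite` is what you would save as the .UUE or .MIM file. If a part is missing, the sketch stops with an error instead of producing a file that can't be decoded.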
Last edited 20.05.2001 by TGOS

![[The Usenet Newbie Project]](zzzpics/newbie-project.png)