Keio University
Fall Semester, 2007

Network Programming in C

Fall Semester 2007, Tuesdays, 2nd period
Course code: 13070 / 2 credits
Category:
Location: SFC
Format: Lecture
Instructor: Rodney Van Meter
E-mail: rdv@sfc.keio.ac.jp

Lecture 7, November 27: Binary data encodings

Outline of This Lecture

Why?

Why do we need to be careful about binary data formats? Because different computers store the same data in different ways, and we want them to interoperate: a value written by one machine must mean the same thing when another machine reads it.

In fact, this is such a big problem that FTP has a binary mode that forces files to be transferred exactly as they are; it is primarily useful between machines of the same type and, usually, the same operating system.

Big Endian and Little Endian

[Figures: big-endian format; little-endian format]

The article Writing Endian-Independent Code in C from IBM's developerWorks website is very good; we will work from it.

Writing files for transportability

Data files may contain one of two types of data: text (typically ASCII, or JIS, Shift-JIS, etc.) or binary data, meaning numbers. One useful tool for seeing the contents of files is the Unix utility od.

[rdv@localhost network-programming-in-c]$ od -t x4z lec07.html  | more
0000000 783f3c0a 76206c6d 69737265 223d6e6f  >.<?xml version="<
0000020 22302e31 636e6520 6e69646f 69223d67  >1.0" encoding="i<
0000040 322d6f73 2d323230 3f22706a 213c0a3e  >so-2022-jp"?>.<!<
0000060 54434f44 20455059 6c6d7468 42555020  >DOCTYPE html PUB<

[rdv@localhost network-programming-in-c]$ od -t x1z lec07.html  | more
0000000 0a 3c 3f 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22  >.<?xml version="<
0000020 31 2e 30 22 20 65 6e 63 6f 64 69 6e 67 3d 22 69  >1.0" encoding="i<
0000040 73 6f 2d 32 30 32 32 2d 6a 70 22 3f 3e 0a 3c 21  >so-2022-jp"?>.<!<
0000060 44 4f 43 54 59 50 45 20 68 74 6d 6c 20 50 55 42  >DOCTYPE html PUB<

The simplest way to make data transportable is to convert it from binary to ASCII text, using a function like printf() instead of write(). However, this approach has two large disadvantages:

Floating Point

Floating point numbers represent a special problem, because different types of processors may define the numbers differently. Simply maintaining byte order is not enough.

XDR

Sun Microsystems developed NFS, the Network File System, in the early 1980s. They needed a way to correctly transfer data among their machines. (Given that all of their machines at the time used the same processor type, this was a visionary, and very fortunate, decision.) They developed XDR, the eXternal Data Representation, for use in RPC, or Remote Procedure Call. XDR serializes data of different types:

Network APIs

The most basic APIs for converting data from host format to network format and back are htonl() and ntohl() for 32-bit integers, and htons() and ntohs() for 16-bit integers.

Again, for examples, we will work from the IBM article.

When transporting mixed types of data, such as text and binary, the sender and receiver must agree on the boundaries of the data.

Other Languages and Approaches

Some of you may have programmed in Java already. If so, you may not even have been aware of the issues around binary data representations, because Java RMI takes care of them for you. Java's standard data streams (for example, DataOutputStream) write multi-byte values in big-endian order regardless of the underlying hardware, and that is the format used for transport across the network as well.

CORBA is a distributed object system that also handles the details internally for you. CORBA is quite complex.

Homework

This week's homework (submit via email):

  1. Compile the example program from the IBM website and report whether your machine is big endian or little endian.
  2. Adapt your TCP client and server programs to use htonl() and ntohl() so that they send and receive binary data correctly.

Next Lecture

Lecture 7, November 13: Binary data encodings

Additional Information

Other