How to use Protocol Buffers

An alternative to XML and to databases, the Protocol Buffers format was created by Google to store structured data in two forms: a text form that is easy to read, and a binary form that is compact and easy for programs to handle.
The format is extensible.

Why Protocol Buffers?

Google defined this format as an alternative to XML in order to process files faster, in particular for its own needs, with its millions of servers consuming large quantities of data.
The easiest way to process data is to hold it as an object in memory and to keep it permanently in a file. Hence the principle of serialization, which consists of saving an entire object to a file.
This can be done in XML, but just as JSON provides a more direct way to do this in JavaScript, PB provides a direct means of doing so in other programming languages such as C++ and Java. JSON relies on the dynamic variables of JavaScript, while PB defines the type of each piece of data.
JSON is better suited than XML to JavaScript, and PB is better suited than XML to other languages because it has none of the drawbacks of XML: heavy use of space and costly data conversion. It retains the advantages of XML: flexibility and easy reading.

Although Google's implementation currently covers only C++, Java and Python, contributors are working to bring it to other platforms, including PHP and Silverlight. It should become part of the Web landscape in the near future.

How does PB work?

From the description in a proto document, the compiler provided by Google generates a class for your software. This class contains methods that automatically encode data into a binary file, or load a binary file and decode it.
The definition can be extended later, and files that were already created can still be read.
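
One way to picture why older programs can still read data produced from an extended definition: in the binary form each value is preceded by its field number, so a reader can skip the numbers it does not know. The following is a minimal sketch of that idea and not the protobuf library itself. It handles integer (varint) fields only, and the names encode_varint, decode_varint, append_field and read_field_1 are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append an unsigned integer in base-128 "varint" form:
// 7 bits per byte, high bit set on every byte except the last.
void encode_varint(std::uint64_t value, std::vector<std::uint8_t>& out) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Read one varint starting at pos and advance pos.
// Returns false if the input ends before the varint is complete.
bool decode_varint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                   std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Append one integer field: a key (field number << 3, wire type 0), then the value.
void append_field(std::uint32_t field_number, std::uint64_t value,
                  std::vector<std::uint8_t>& out) {
    encode_varint(static_cast<std::uint64_t>(field_number) << 3, out);
    encode_varint(value, out);
}

// An "old" reader that only knows field 1 and skips every other field.
bool read_field_1(const std::vector<std::uint8_t>& in, std::uint64_t& result) {
    bool found = false;
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::uint64_t key = 0, value = 0;
        if (!decode_varint(in, pos, key)) return false;
        if ((key & 7) != 0) return false;   // this sketch supports varint fields only
        if (!decode_varint(in, pos, value)) return false;
        if ((key >> 3) == 1) { result = value; found = true; }
    }
    return found;
}
```

If a newer program appends field 1 and then a field 2 that the old reader has never heard of, read_field_1 still recovers the value of field 1 and silently ignores the rest.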

The steps are as follows:
- You define the structure of the protocol.
- The compiler generates a class, which is a subclass of the Message class.
- You write a program using this class.
- You compile it, including the generated class and the protobuf library.
- Data can be saved in a proto file through the serialization inherited from the Message class.
- You can also create a Web service and open a channel to a client.
- Your data is exchanged between the client and the server according to your protocol.

Using Protocol Buffers

Here are the steps to create a first program with Protocol Buffers.

1) Download the archives

You will need two archives: one for the compiler and one for the library to include.
On Google's Protocol Buffers site, choose the archives for your system.

Extract the content, which creates a Protobuf directory containing the protoc compiler.

2) Compile the protobuf library

The process depends on the language, the compiler and the system as explained in README.txt in the root directory of Protobuf extracted from the archive.
For our example we will use Visual Studio Express on Windows.

Launch VSE and load the project file located in the vsprojects directory.
Start the build. It will probably be preceded by a conversion, because the project is kept compatible with older versions of VSE.
The build creates many files, but the ones that interest us are:

protobuf.lib
protobuf.dll

They may be accompanied by protobuf.dll files with an additional extension.

3) Create a development environment

For practical reasons, you should create a new directory for your project. You can call it "proto" for example.
In this directory, copy:

protoc.exe
protobuf.lib
protobuf.dll
google (a subdirectory of Protobuf)

These are the proto compiler (protoc.exe under Windows), the libraries, and the directory of header files to include.

4) Define a protocol

In this minimalist file, we define a single message initialized with the text "Hello World!".

We define a text field that we call "data":

message hello
{
required string data;
}

We add a field number, a unique tag that identifies the field in the binary format:

message hello
{
required string data = 1;
}

We add a default value, which initializes the field. It is written between square brackets:

[default="Hello World!"]

This gives the following definition:

message hello
{
required string data = 1 [default="Hello World!"];
}

The file is saved under the name hello.proto in our working directory.

5) Compile the proto file

We create a batch file named makehello.bat that contains the following command:

protoc --cpp_out=. hello.proto

--cpp_out defines the destination directory; the dot indicates the current directory.
hello.proto is the definition file previously created.

The compilation will produce the following files:

hello.pb.cc
hello.pb.h

The protoc compiler has generated a C++ class corresponding to our definition, along with its header file.
We now have to write a C++ program that uses this class.

6) Write the hello program

This example simply displays the message defined in the proto file. Then we modify the content of the field by assigning another string.

#include <iostream>
#include <string>
#include "hello.pb.h"   // header generated by protoc from hello.proto

using namespace std;

int main()
{
hello demo; // instance "demo" of the "hello" class defined in proto

const std::string& test = demo.data(); // data is the field where the text was stored
cout << test << "\n";

demo.set_data("How do you do?"); // changing the data
cout << demo.data() << "\n";
return 0;
}

The demo.data() method gives access to the content of the data field of our proto definition. This interface is generated by the protoc compiler.
It also generates the demo.set_data() method, which changes the content of the field.
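
To give an idea of this interface, here is a simplified, hand-written stand-in for the class generated from hello.proto. It is only a sketch and not the output of protoc: the real class derives from Message and contains much more. The class name HelloSketch is hypothetical. The accessors data(), set_data(), has_data() and clear_data() mirror those that protoc generates for a string field of this kind.

```cpp
#include <string>

// Hand-written approximation of the accessors generated for:
//   required string data = 1 [default="Hello World!"];
class HelloSketch {
public:
    // Returns the stored value, or the default from the .proto file if unset.
    const std::string& data() const {
        return has_data_ ? data_ : default_data();
    }

    // Stores a new value and records that the field is now set.
    void set_data(const std::string& value) {
        data_ = value;
        has_data_ = true;
    }

    // Tells whether a value was explicitly set.
    bool has_data() const { return has_data_; }

    // Forgets the stored value; data() returns the default again.
    void clear_data() {
        data_.clear();
        has_data_ = false;
    }

private:
    static const std::string& default_data() {
        static const std::string value = "Hello World!";
        return value;
    }

    std::string data_;
    bool has_data_ = false;
};
```

With this stand-in, a freshly created object reports the default text until set_data() is called, which is the behavior the hello program relies on.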

Full source code in the archive.

7) Compile the hello program

Two lines are added to the batch file. The first compiles the file generated by protoc and produces hello.pb.obj; the second compiles our own program:

protoc --cpp_out=. hello.proto

cl hello.pb.cc /EHsc -I. -c

cl hello.cpp  /EHsc -I. hello.pb.obj libprotobuf.lib
  

The second compile command includes the file hello.pb.obj and the import library libprotobuf.lib (which provides access to the content of protobuf.dll).

makehello.bat creates the hello.exe binary executable. It should display:

Hello World!
How do you do?

8) Save data from the class

Thanks to methods inherited from the Message superclass, the content changed in our class can be stored permanently and retrieved later by the same program or by another.

To do this you can use the C++ fstream class in combination with the inherited SerializeToOstream method.

We define a new, very simple proto file:

message hello
{
  required int32 data = 1 [default=10];
}

The body of the function becomes:

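hello demo;

fstream input("hellowrite.proto", ios::in | ios::binary);
demo.ParseFromIstream(&input);       // load the saved value; the default is kept if the file is missing
input.close();

cout << demo.data() << "\n";

demo.set_data(demo.data() + 10);     // adding a value

fstream output("hellowrite.proto", ios::out | ios::trunc | ios::binary);
demo.SerializeToOstream(&output);    // write the new value to disk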
  

Each time we run the program, the current number is incremented by 10, which proves that the content of data is saved to disk. In fact, a new binary proto file is created and the new value is stored in it.
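
The load, modify and save cycle can be tried without the protobuf library using the sketch below. It is only an illustration of the flow: the function name run_once is hypothetical, and the number is stored as raw bytes in the machine's byte order, which is not the Protocol Buffers encoding.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Load the stored number (or the default 10 if the file is missing or too short),
// save that number plus 10, and return the value that was loaded.
std::int32_t run_once(const std::string& path) {
    std::int32_t value = 10;                 // same default as in the proto file
    {
        std::ifstream input(path, std::ios::in | std::ios::binary);
        std::int32_t stored = 0;
        if (input.read(reinterpret_cast<char*>(&stored), sizeof stored)) {
            value = stored;                  // a previous run left a value behind
        }
    }                                        // input is closed here

    const std::int32_t next = value + 10;    // adding a value
    std::ofstream output(path, std::ios::out | std::ios::trunc | std::ios::binary);
    output.write(reinterpret_cast<const char*>(&next), sizeof next);
    return value;
}
```

Calling run_once several times with the same path returns a value that grows by 10 on each call, just as successive runs of the program above do.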

Full source code in the archive.

9) Using a server

The compiler also creates interfaces between the server and its clients, provided we define a service (HelloRequest and HelloResponse are messages, defined in the same way as hello):

service hello 
{    
    rpc data(HelloRequest) returns(HelloResponse);  
}  

The process of creating a service is described in the Protocol Buffers API.

hello_Stub(RpcChannel* channel)

This constructor of the generated stub class (named after the service, with the suffix _Stub) gives access to the channel through which messages are sent.
This is detailed in Google's tutorial.
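
The respective roles of the stub and the channel can be pictured with a small hand-written sketch. It does not use the protobuf library: Channel, LoopbackChannel and HelloStub are hypothetical stand-ins for RpcChannel and the generated stub, and plain strings take the place of HelloRequest and HelloResponse.

```cpp
#include <string>

// Hypothetical stand-in for RpcChannel: it carries a request to the other side
// and brings back the response.
struct Channel {
    virtual ~Channel() = default;
    virtual std::string Call(const std::string& method,
                             const std::string& request) = 0;
};

// A channel that stays inside the process and answers by itself.
// A real channel would send the request over a network connection.
struct LoopbackChannel : Channel {
    std::string Call(const std::string& method,
                     const std::string& request) override {
        return method + " received: " + request;
    }
};

// Hypothetical stand-in for the generated stub: one method per rpc declared
// in the service, each forwarding the call to the channel given at construction.
class HelloStub {
public:
    explicit HelloStub(Channel* channel) : channel_(channel) {}

    std::string data(const std::string& request) {
        return channel_->Call("hello.data", request);
    }

private:
    Channel* channel_;   // not owned by the stub
};
```

A client builds the stub with a pointer to its channel, for example HelloStub stub(&channel), then calls stub.data(...) as if the service were a local object.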

Further steps

This is an introduction to the use of the Protocol Buffers language. For a detailed explanation of the format, go to Google:

See also the header file generated by protoc; it lists the available methods.

The examples in this tutorial are kept minimal to aid understanding; they should not be used as is in production. In real applications, many checks must be added.

Download the examples: