Protocol Buffers - Data exchange
Protocol Buffers is a data definition language created by Google. It is comparable to an IDL (interface definition language) but much simpler. Its syntax, derived from C, resembles JSON, except that fields are typed.
Google designed the language for its own servers, which store and exchange large quantities of structured data, and released it as open source in 2008.
A message exists in two forms: the human-readable source and a binary encoding that the machine handles quickly. It is an alternative to XML that is much more compact and considerably faster to process.
Glossary
Protocol Buffers: name of the language, and of the units of data described in a proto file.
Proto: a data definition file in the PB language, with the .proto extension.
Protoc: name of the compiler that produces classes or binaries.
Features
- Object-oriented: each generated message class inherits from the Message class.
- Typed data language.
- Textual and binary formats.
- The Protoc compiler generates, from the data definition, a class in the chosen language.
- The compiler generates C++ or Java classes, and the format is intended to be usable from all languages.
- An instance of the class is serialized into binary data. Protoc can also produce the binary form from a message written in text form.
- A unit is called "message". A .proto file can contain several messages.
- Supports namespaces, declared with the package keyword.
- The structure of a message is recursive: a message may contain fields that are themselves messages.
- Repeated fields: as with elements in XML, a field declared repeated can occur any number of times in the same message.
- Dynamically extensible.
Why use Protocol Buffers?
It is a means of storing structured data and exchanging it between programs, possibly written in different programming languages, and between a server and a client. A library of functions is included to assist in the use of Web services.
It allows cross-language serialization of classes.
The serialization produces compact, easy-to-process binary data.
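Part of that compactness comes from how integers are written. The sketch below is a hand-written illustration of the variable-length ("varint") encoding that the binary format uses for types such as int32 and int64. The function name encode_varint is an assumption made for this example; it is not the Protocol Buffers library API.

```cpp
#include <cstdint>
#include <vector>

// Illustrative helper (hypothetical name, not the library API).
// Writes a non-negative value as a base-128 varint: 7 bits per byte,
// least significant group first. The high bit of a byte is set when
// more bytes follow.
std::vector<std::uint8_t> encode_varint(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}
```

With this helper, encode_varint(18) yields the single byte 0x12 and encode_varint(300) yields the two bytes 0xAC 0x02, so small values cost far less than a fixed four or eight bytes.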
Syntax
Each source has the form:
message name {
...list of data fields...
}
Field modifiers are: required, optional, repeated.
Each field is assigned a unique number (the = 1 in the examples). It is a tag that identifies the field in the binary encoding, not a value assigned to the field.
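To make the role of that number concrete, here is a small sketch of how the binary format combines it with a wire type into the key that precedes each value. The names make_key and WireType are chosen for this illustration and are not the library's API. The sketch only covers field numbers 1 to 15, whose key fits in one byte.

```cpp
#include <cassert>
#include <cstdint>

// Wire types used by the binary format (subset, for illustration).
enum WireType : std::uint8_t {
    kVarint = 0,           // int32, int64, bool, enum
    kFixed64 = 1,          // double
    kLengthDelimited = 2,  // string, nested messages
    kFixed32 = 5           // float
};

// Hypothetical helper: builds the one-byte key for field numbers 1..15.
// Larger field numbers need a multi-byte key, which is not handled here.
std::uint8_t make_key(int field_number, WireType type) {
    assert(field_number >= 1 && field_number <= 15);
    return static_cast<std::uint8_t>((field_number << 3) | type);
}
```

For a field declared as required int32 number = 1, make_key(1, kVarint) gives 0x08, so the value 18 travels as the two bytes 0x08 0x12. The field name never appears in the binary data, which is why a number should not be changed once a message format is in use.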
An initial value may be assigned with the default directive:
required string x = 1 [default="Some text"];
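The effect of a default can be pictured with a hand-written stand-in for a generated class. This is a sketch, not actual Protoc output: the class name Example is hypothetical, and only the accessor naming (x, has_x, set_x, clear_x) follows the pattern of the generated C++ code.

```cpp
#include <string>

// Hand-written stand-in (NOT protoc output) modelling a string field x
// declared with [default="Some text"].
class Example {
public:
    // Returns the default until the field has been set explicitly.
    const std::string& x() const { return has_x_ ? x_ : default_x(); }
    bool has_x() const { return has_x_; }
    void set_x(const std::string& value) { x_ = value; has_x_ = true; }
    // Clearing the field makes the getter return the default again.
    void clear_x() { x_.clear(); has_x_ = false; }

private:
    static const std::string& default_x() {
        static const std::string kDefault = "Some text";
        return kDefault;
    }
    std::string x_;
    bool has_x_ = false;
};
```

The point of the sketch is that reading an unset field yields the default, while has_x() still reports that no value was assigned.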
The main scalar types are string, int32, int64, float, double, bool.
In addition, nested types are created by defining a message inside another message:
message container
{
  required int32 number = 1;
  message contained
  {
    repeated string x = 1;
  }
}
From outside, the nested type is referred to by its qualified name, container.contained, and its field by container.contained.x.
Enumerations, declared with enum, can also be included in messages.
Once the structure of a message is defined, a program uses it by creating an instance. The instance has class-specific methods, which the compiler produces in the generated C++ or Java files.
container myinstance;
myinstance.set_number(18);
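To show what such an instance offers without running Protoc, the following is a hand-written stand-in for the container message defined earlier. It is a sketch under assumptions, not generated code: it only mimics the accessor naming of the generated C++ classes (set_number and number for a scalar field; add_x, x_size and x for a repeated field) and leaves out serialization entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hand-written stand-in (NOT protoc output) for the container message.
class container {
public:
    // Nested message type, reachable from C++ as container::contained.
    class contained {
    public:
        void add_x(const std::string& value) { x_.push_back(value); }
        int x_size() const { return static_cast<int>(x_.size()); }
        const std::string& x(int index) const {
            return x_.at(static_cast<std::size_t>(index));
        }

    private:
        std::vector<std::string> x_;  // repeated string x = 1;
    };

    void set_number(std::int32_t value) { number_ = value; }
    std::int32_t number() const { return number_; }

private:
    std::int32_t number_ = 0;  // required int32 number = 1;
};
```

After myinstance.set_number(18), a call to myinstance.number() returns 18. A container::contained object grows by one element on each add_x call, and x_size() reports the current count.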
Sites and tools
- Download the Protoc compiler and access the full documentation.
- Protocol Buffers, a tutorial for a first use.
- Definition of the PB programming language.
Sample code
Simple message.
// The field names were missing here; "text" and "count" are placeholders.
message hello
{
  required string text = 1 [default="the message"];
  optional int32 count = 2;
}
(c) 2008-2009 Scriptol.com