This interface allows kdb users to parse data which has been encoded using Google Protocol Buffers (protobuf) into kdb according to the proto schema and serialise it back to the encoded wire format. The interface utilises the libprotobuf descriptor and reflection C++ APIs.
This is part of the Fusion for kdb interface collection.
kdb+ is the world’s fastest timeseries database, optimized for ingesting, analyzing and storing massive amounts of structured data. To get started with kdb+, visit https://code.kx.com/q/learn/ for downloads and developer information. For general information, visit https://kx.com/
Protocol Buffers (Protobuf) is a language-neutral, platform-neutral, extensible mechanism for serializing structured data. It is used both in the development of programs which are required to communicate over the wire and for data storage. Developed originally for internal use by Google, it is released under an open source BSD license. The core principle behind this data format is to allow a user to define the expected structure of their data and incorporate this within specially generated source code. This allows a user to easily read and write structured data to and from a variety of languages.
Protobuf messages are defined in a .proto schema file and these message definitions must be imported into the interface in order for it to be able to create messages of those types. The interface supports two ways to do this (or a combination of both) but the method used will impact how protobufkdb should be installed.
Normally the Protocol Buffers compiler is used to generate source code from the .proto schema files, which is then compiled into the binary. protoc compiles your message definitions based on a file defined as:
<schema>.proto
producing both a C++ source and header file defined as:
<schema>.pb.cc
<schema>.pb.h
These files contain the classes and metadata which describe the schema and the functionality required to serialize to and parse from this schema.
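As a rough illustration of this step, the sketch below writes a small schema and, if protoc is available on the PATH, generates the C++ source and header from it. The file name example_schema.proto and the message Reading are hypothetical and exist only for this example; they are not part of protobufkdb.

```shell
#!/bin/sh
# Hypothetical schema, used only for illustration.
cat > example_schema.proto <<'EOF'
syntax = "proto3";

message Reading {
  string sensor = 1;
  double value  = 2;
}
EOF

# Generate example_schema.pb.cc and example_schema.pb.h in the current directory.
if command -v protoc >/dev/null 2>&1; then
  protoc --cpp_out=. example_schema.proto
else
  echo "protoc not found on PATH; skipping code generation" >&2
fi
```

When building protobufkdb itself this invocation is not run by hand; the CMake build described later runs the compiler over the listed .proto files.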
This mechanism is more performant but does require that protobufkdb be built from source since the binary needs to be rebuilt to change the statically available messages.
To provide greater flexibility and usability it is also possible to dynamically import a .proto schema file at runtime from within the q session. Imported message definitions can then be used subsequently by the interface and behave similarly to compiled in ones (the import procedure leverages the same functionality as used by the protobuf compiler).
If only dynamically imported message definitions are required then the packaged installation of protobufkdb can be used. However, importing message definitions is less performant - in addition to the one-off import cost, there is also an overhead from the subsequent use of these dynamically created messages (approx. 10% for parsing, 20% for serializing). Alternatively a hybrid approach can be employed where dynamic messages are used during development until the schemas are finalized, at which point they are compiled into the interface.
The protobufkdb releases are linked statically against libprotobuf to avoid potential C++ ABI compatibility issues with different versions of libprotobuf. Therefore it is unnecessary to install protobuf separately when using a packaged release.
- Download a release from here
- Install the required q executable script q/protobufkdb.q and the binary file lib/protobufkdb.(so|dll) to $QHOME and $QHOME/[mlw](64) respectively, by executing the following from the Release directory:

      ## Linux/MacOS
      chmod +x install.sh && ./install.sh

      ## Windows
      install.bat

- To use the KdbTypeSpecifier field option (described below) with dynamic messages, the directory containing kdb_type_specifier.proto must be specified to the interface as an import search location. In the release package kdb_type_specifier.proto (and its dependencies) are found in the proto subdirectory. Import paths can be relative or absolute. For example, if the q session is started from the root of the release package run:

      .protobufkdb.addProtoImportPath["proto"]
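After running the install script it can be useful to confirm that both files landed where q expects them. The following is a minimal sketch of such a check; the function name check_protobufkdb_install is hypothetical, and the fallback to ~/q when QHOME is unset is an assumption made for this example.

```shell
#!/bin/sh
# check_protobufkdb_install QHOME_DIR PLATFORM_DIR
# PLATFORM_DIR is one of l64, m64 or w64. Returns non-zero if a file is missing.
check_protobufkdb_install() {
  qhome=$1
  platform=$2
  rc=0
  case $platform in
    w*) lib=protobufkdb.dll ;;
    *)  lib=protobufkdb.so ;;
  esac
  for f in "$qhome/protobufkdb.q" "$qhome/$platform/$lib"; do
    if [ -f "$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
      rc=1
    fi
  done
  return $rc
}

# Example: check a 64-bit Linux install (assumes ~/q when QHOME is unset).
check_protobufkdb_install "${QHOME:-$HOME/q}" l64 || echo "protobufkdb is not fully installed"
```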
Protobufkdb requires the full Protocol Buffers runtime (protoc compiler, libprotobuf and its header files) to be installed on your system. Many packaged installations only contain a subset of the required functionality or use an incompatible build. Furthermore, version mismatches can occur between protoc and libprotobuf if a new installation is applied on top of an existing one.
It is therefore recommended that the Protocol Buffers runtime is built from source and installed to a non-system directory. This directory can then be specified to the protobufkdb build so it will use that Protocol Buffers installation in preference to any existing system installs.
The tools required to build Protocol Buffers from source on Linux/macOS are described here.
However, do not build Protocol Buffers using Google's configure script, since that will create a debug version of libprotobuf.a which protobufkdb links against. Rather, follow the instructions below to build Protocol Buffers using CMake with the correct compiler flags and install it to a non-system directory.
Clone the Protocol Buffers source from GitHub:
git clone https://github.com/protocolbuffers/protobuf.git
cd protobuf
Create an install directory and set an environment variable to this directory (this is used again later when building protobufkdb):
mkdir install
export PROTOBUF_INSTALL=$(pwd)/install
Create the CMake build directory and generate the build files, specifying position independent code (otherwise symbol relocation errors will occur during linking of protobufkdb):
mkdir cmake/build
cd cmake/build
cmake -DCMAKE_BUILD_TYPE=Release -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_INSTALL_PREFIX=$PROTOBUF_INSTALL ..
Finally build and install Protocol Buffers:
cmake --build . --config Release
cmake --build . --config Release --target install
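Before moving on to the protobufkdb build, it may help to confirm that the install directory holds the compiler, the headers and the static library together. The sketch below is one way to do that; verify_protobuf_install is a hypothetical helper name, and the bin, include and lib (or lib64) layout is assumed from a standard CMake install.

```shell
#!/bin/sh
# verify_protobuf_install PREFIX
# Checks that PREFIX holds protoc, the headers and a static libprotobuf.
verify_protobuf_install() {
  prefix=$1
  rc=0
  [ -x "$prefix/bin/protoc" ] || { echo "missing: $prefix/bin/protoc"; rc=1; }
  [ -d "$prefix/include/google/protobuf" ] || { echo "missing: $prefix/include/google/protobuf"; rc=1; }
  # Some platforms install libraries under lib64 rather than lib.
  if [ ! -f "$prefix/lib/libprotobuf.a" ] && [ ! -f "$prefix/lib64/libprotobuf.a" ]; then
    echo "missing: libprotobuf.a under $prefix/lib or $prefix/lib64"
    rc=1
  fi
  return $rc
}

if [ -n "${PROTOBUF_INSTALL:-}" ]; then
  verify_protobuf_install "$PROTOBUF_INSTALL" && "$PROTOBUF_INSTALL/bin/protoc" --version
else
  echo "PROTOBUF_INSTALL is not set; skipping check"
fi
```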
The tools required to build Protocol Buffers from source on Windows are described here and details on how to set up your environment to build with VS2019 are here. Then follow the instructions below to build a Release version of Protocol Buffers and install it to a non-system directory.
From a Visual Studio command prompt, clone the Protocol Buffers source from GitHub:
C:\Git> git clone https://github.com/protocolbuffers/protobuf.git
C:\Git> cd protobuf
Create an install directory and set an environment variable to this directory (substituting the correct absolute path as appropriate). This environment variable is used again later when building protobufkdb:
C:\Git\protobuf> mkdir install
C:\Git\protobuf> set PROTOBUF_INSTALL=C:\Git\protobuf\install
Create the CMake build directory (note that if you also wish to build a Debug version of Protocol Buffers then a second CMake build directory is required):
C:\Git\protobuf> mkdir cmake\release_build
C:\Git\protobuf> cd cmake\release_build
Generate the build files (this will default to using the Visual Studio CMake generator when run from a VS command prompt):
C:\Git\protobuf\cmake\release_build> cmake -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=%PROTOBUF_INSTALL% ..
Finally build and install Protocol Buffers:
C:\Git\protobuf\cmake\release_build> cmake --build . --config Release
C:\Git\protobuf\cmake\release_build> cmake --build . --config Release --target install
Protobufkdb uses a factory to create a message class object of the correct type from the message type string passed from kdb. The lookup requires that the message type string passed from kdb is the same as the message name in its .proto definition.
In order to populate the factory, the .proto files for all messages to be serialised/parsed must be incorporated into the build as follows:
- Place the new <schema>.proto file into the src/ subdirectory
- Edit the src/CMakeLists.txt file, adding the new .proto file to the line below the following comment:

      # ### GENERATE PROTO FILES ###

  For example, to add examples.proto (which is already present in the src/ subdirectory), in addition to the existing tests.proto, change:

      set(MY_PROTO_FILES tests.proto)

  to:

      set(MY_PROTO_FILES tests.proto examples.proto)

  Note: MY_PROTO_FILES is a CMake space-separated list; do not wrap the list of .proto files in a string.
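The edit to src/CMakeLists.txt can also be scripted. The following is a minimal sketch, assuming the set(MY_PROTO_FILES ...) line starts at the beginning of a line and fits on a single line as in the example above; add_proto_file is a hypothetical helper name, not something shipped with protobufkdb.

```shell
#!/bin/sh
# add_proto_file CMAKELISTS PROTO
# Appends PROTO to the set(MY_PROTO_FILES ...) line unless it is already listed.
add_proto_file() {
  cmakelists=$1
  proto=$2
  if grep -q "^set(MY_PROTO_FILES.* $proto[ )]" "$cmakelists"; then
    echo "$proto already listed"
    return 0
  fi
  # Write to a temporary file first: in-place flags differ between GNU and BSD sed.
  sed "s/^\(set(MY_PROTO_FILES [^)]*\))/\1 $proto)/" "$cmakelists" > "$cmakelists.tmp" &&
    mv "$cmakelists.tmp" "$cmakelists"
}

# Example: run from the root of the repository.
if [ -f src/CMakeLists.txt ]; then
  add_proto_file src/CMakeLists.txt examples.proto
fi
```

The file name is appended as a bare token, which keeps MY_PROTO_FILES a space-separated list rather than a quoted string.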
A CMake script is provided to build protobufkdb. This uses the CMake functionality to locate the protobuf installation on your system. By setting the CMake environment variable CMAKE_PREFIX_PATH to the Protocol Buffers installation directory created above when building protobuf from source, CMake will use this installation in preference to any existing system installs. This avoids issues with existing incompatible or mismatched protobuf installs.
From the root of this repository create and move into a directory in which to perform the build:
mkdir build && cd build
Generate the build scripts, specifying the Protocol Buffers installation created above when building protobuf from source (referenced by the environment variable $PROTOBUF_INSTALL which should have been set during that procedure):
## Linux/MacOS
cmake -DCMAKE_PREFIX_PATH=$PROTOBUF_INSTALL ..
## Windows
cmake -DCMAKE_PREFIX_PATH=%PROTOBUF_INSTALL% ..
Start the build:
cmake --build . --config Release
Create the install package and deploy:
cmake --build . --config Release --target install
Note: By default src/CMakeLists.txt is configured to link statically against libprotobuf to avoid potential C++ ABI compatibility issues with different versions of libprotobuf. This is a particular issue on Windows.
Because the protobufkdb interface uses both the protoc compiler and the Protocol Buffers runtime, the versions of protoc, libprotobuf and its header files must be consistent and installed from the same build. Otherwise build errors can occur when compiling any of the proto-generated .pb.h or .pb.cc files. To help identify these problems, the protobufkdb CMake scripts log the locations of the Protocol Buffers installation they have found. For example:
[build]$ cmake ..
-- The CXX compiler identification is GNU 4.8.5
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Generator : Unix Makefiles
-- Build Tool : /usr/bin/gmake
-- Proto files: tests.proto;examples.proto
-- [ /usr/share/cmake3/Modules/FindProtobuf.cmake:321 ] Protobuf_USE_STATIC_LIBS = ON
-- [ /usr/share/cmake3/Modules/FindProtobuf.cmake:455 ] requested version of Google Protobuf is
-- [ /usr/share/cmake3/Modules/FindProtobuf.cmake:463 ] location of common.h: /usr/local/include/google/protobuf/stubs/common.h
-- [ /usr/share/cmake3/Modules/FindProtobuf.cmake:481 ] /usr/local/include/google/protobuf/stubs/common.h reveals protobuf 3.7.1
-- [ /usr/share/cmake3/Modules/FindProtobuf.cmake:495 ] /home/protobuf/install/bin/protoc reveals version 3.11.4
-- Found Protobuf: /usr/local/lib/libprotobuf.a;-lpthread (found version "3.7.1")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/protobufkdb/build
indicates it found protoc version 3.11.4 at /home/protobuf/install/bin/protoc but version 3.7.1 of libprotobuf.a (and the headers) installed on the system under /usr/local/. This can occur if there was a conflicting packaged version of protobuf already on the system and will likely cause the protobufkdb build to fail.
The solution, as described above, is to build the Protocol Buffers runtime from source, install it to a non-system directory, then specify that directory when building protobufkdb.
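The mismatch can also be detected mechanically from saved configure output. The sketch below assumes the output contains the two "reveals" lines in the form shown in the log above; check_protobuf_versions and the log file name are hypothetical and used only for illustration.

```shell
#!/bin/sh
# check_protobuf_versions LOGFILE
# Reads saved CMake configure output and compares the libprotobuf header
# version with the protoc version, as reported in the "reveals" lines.
check_protobuf_versions() {
  log=$1
  lib_version=$(sed -n 's/.*reveals protobuf \([0-9][0-9.]*\).*/\1/p' "$log" | head -n 1)
  protoc_version=$(sed -n 's/.*protoc reveals version \([0-9][0-9.]*\).*/\1/p' "$log" | head -n 1)
  if [ -z "$lib_version" ] || [ -z "$protoc_version" ]; then
    echo "could not find both versions in $log"
    return 2
  fi
  if [ "$lib_version" = "$protoc_version" ]; then
    echo "OK: libprotobuf and protoc are both $lib_version"
  else
    echo "MISMATCH: libprotobuf $lib_version, protoc $protoc_version"
    return 1
  fi
}

# Usage (from the build directory), with a hypothetical log file name:
#   cmake .. 2>&1 | tee cmake_configure.log
#   check_protobuf_versions cmake_configure.log
```

Run against the example log above, this reports libprotobuf 3.7.1 against protoc 3.11.4 and returns a non-zero status.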
A sample Docker file is provided in the docker_linux directory to create an Ubuntu 18.04 LTS environment (including downloading and building the Protocol Buffers runtime from source) before building and installing the kdb protobufkdb interface.
For Docker Windows, the PROTOBUFKDB_SOURCE and QHOME_LINUX directories are specified at the top of protobufkdb_build.bat, which sets up the environment specified in Dockerfile.build and invokes protobufkdb_build.sh to build the interface.
The protobufkdb interface is provided here as a beta release under an Apache 2.0 license.
If you find issues with the interface or have feature requests, please raise an issue.
To contribute to this project, please follow the contribution guide.
Protocol Buffers is used under the terms of Google’s license:
Copyright 2008 Google Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Code generated by the Protocol Buffer compiler is owned by the owner
of the input file used when generating it. This code is not
standalone and requires a support library to be linked with it. This
support library is itself covered by the above license.