Over the last several years, we have been working on a new release of BGPStream. This version contains several new features, including native support for true real-time data access via BMP, as well as numerous bug fixes and performance improvements. See the Changes section below for a more detailed list.

Since there have been significant changes made to the core of BGPStream, we are making this pre-release snapshot of v2 available to selected users for testing and evaluation before making a full public beta release.

See the Public Live BMP Stream section below for help accessing the new realtime BMP stream.

Feedback

We appreciate your support of this project, and your willingness to help us test the next version. Please create an issue on either the libBGPStream or PyBGPStream GitHub repos detailing any problems you have, or alternatively, contact bgpstream-info@caida.org.

Changes

New Features

  • Native BMP support
    • libBGPStream now supports processing raw BMP data in the same way as MRT.
    • Currently the "singlefile" and the new "kafka" and "beta-bmp-stream" data interfaces provide access to BMP data. The "singlefile" interface can be used to process local dumps of raw BMP data whereas the "kafka" interface can be used to access BMP (or MRT) data from a Kafka cluster.
  • Realtime data stream via RIS Live
    • libBGPStream now supports obtaining realtime BGP data from RIPE's RIS Live BGP message stream.
    • Access to RIS Live stream is supported using the "ris-live" data interface.
    • To start streaming using bgpreader: bgpreader -d ris-live
  • Realtime data stream via OpenBMP
    • libBGPStream now supports obtaining realtime BGP data from a Kafka cluster.
    • Access to private OpenBMP feeds is supported using the "kafka" data interface, whereas the "beta-bmp-stream" data interface may be used to access the public BMP stream provided as part of the BGPStream project.
    • See the Public Live BMP Stream and Private OpenBMP Collector sections below for more information.
  • Local caching of dump files (optional)
    • Data files processed by the broker can now be cached to a local directory which is checked before downloading a dump file.
    • Previously, when using BGPStream to repeatedly process the same data (e.g., when testing/debugging code), poor network connectivity could add overhead to processing time.
    • The caching implementation is thread safe and can support parallel instances of BGPStream (either as threads or separate processes).
    • The cache can be enabled by setting the cache-dir parameter of the "broker" data interface. E.g., by passing -o cache-dir=/path/to/cache to bgpreader, or by calling stream.set_data_interface_option("broker", "cache-dir", "/path/to/cache") from PyBGPStream.
    • Thanks to Mingwei Zhang for contributing this feature.
  • New high-level PyBGPStream API (prototype)
    • There is now a high-level "Pythonic" API for accessing BGP data using BGPStream.
    • Previously the only Python interface was _pybgpstream, a low-level, almost exact binding to the libBGPStream C API.
    • See the API docs below for more information.
  • New filter interface
    • libBGPStream now supports a "BPF-like" syntax for specifying filters.
    • E.g., collector route-views.sg and type ribs and prefix exact 192.172.226.0/24 would extract only RIB records matching the given prefix from the route-views.sg collector.
    • For BGPReader, this feature may be accessed using the -f option.
    • For PyBGPStream, filters can be specified either using the filter parameter to the BGPStream constructor, or by calling the parse_filter_string method on an existing stream.
    • While the previous APIs for specifying filters are still available, this interface should be used as new filter types will only be available through this interface, and eventually the other filter methods will be deprecated.
    • See the temporary documentation for more information.
    • Thanks to Shane Alcock for contributing this feature.
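As a rough illustration of the filter syntax described above, the following sketch assembles a filter string from individual terms. The helper name build_filter is hypothetical and not part of BGPStream; it only joins terms with "and", using the keywords that appear in the examples in this document, and performs no validation.

```python
def build_filter(terms):
    """Join (keyword, value) pairs into a single filter string.

    `terms` is a list of pairs such as ("collector", "route-views.sg") or
    ("prefix exact", "192.172.226.0/24"). Nothing is validated here; the
    library decides whether the resulting string is acceptable.
    """
    parts = []
    for keyword, value in terms:
        parts.append("{} {}".format(keyword, value))
    return " and ".join(parts)


# Rebuild the example filter given in the text above.
example = build_filter([
    ("collector", "route-views.sg"),
    ("type", "ribs"),
    ("prefix exact", "192.172.226.0/24"),
])
print(example)
```

The resulting string could then be passed to bgpreader with the -f option, or as the filter parameter of the BGPStream constructor in PyBGPStream.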

Bug Fixes

  • Patricia Trie Fixes
    • We have made several fixes to the Patricia trie implementation used by libBGPStream.
    • For most users this only affected prefix filtering. Previously all prefix filters were treated as "ANY" filters (rather than MORE/EXACT/LESS).
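To make the difference between the prefix filter modes concrete, here is a minimal sketch using Python's standard ipaddress module. It is not the libBGPStream implementation (which uses a Patricia trie), and the interpretation of each mode is an assumption: "exact" matches only the same prefix, "more" matches the same or a more specific prefix, "less" matches the same or a less specific prefix, and "any" matches either. The function name prefix_matches is hypothetical.

```python
import ipaddress


def prefix_matches(candidate, filter_prefix, mode):
    """Return True if `candidate` matches `filter_prefix` under `mode`.

    Assumed semantics (not taken from the libBGPStream source):
      exact: identical prefix
      more:  candidate is the same as, or more specific than, the filter
      less:  candidate is the same as, or less specific than, the filter
      any:   either "more" or "less"
    """
    cand = ipaddress.ip_network(candidate)
    filt = ipaddress.ip_network(filter_prefix)
    if cand.version != filt.version:
        return False  # IPv4 and IPv6 prefixes never match each other
    is_more = cand.subnet_of(filt)    # requires Python 3.7+
    is_less = cand.supernet_of(filt)
    if mode == "exact":
        return cand == filt
    if mode == "more":
        return is_more
    if mode == "less":
        return is_less
    if mode == "any":
        return is_more or is_less
    raise ValueError("unknown mode: {!r}".format(mode))


# A /24 inside the filter's /16 matches "more" and "any" but not "exact".
for mode in ("exact", "more", "less", "any"):
    print(mode, prefix_matches("210.180.1.0/24", "210.180.0.0/16", mode))
```

Under the old behaviour described above, every filter would have behaved like the "any" case in this sketch.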

Performance Improvements

  • New MRT parser
    • Along with developing a BMP parser, we also developed a new MRT parser, libParseBGP, to replace the fork of libbgpdump we used in V1.
    • libParseBGP should be much faster than libbgpdump (~30% faster in some tests we ran) and is designed to be easier to maintain and extend.
    • It is possible/likely that there are some MRT peculiarities that are not correctly handled by libParseBGP (yet). If you come across any problems, please contact us.
  • Completely re-designed resource management
    • To support addition of other data formats (e.g., BMP), and data transport mechanisms (e.g., Kafka), we completely re-designed the core resource management components of libBGPStream.
    • In addition to simplifying adding support for new formats, the new implementation appears to perform ~10% better than V1.
  • Improved Record API
    • We changed the get_next_record API to return a borrowed pointer to an internal record structure, rather than filling a structure passed by the user.
    • This minimizes memory allocations and copying within libBGPStream, improving performance.

PyBGPStream Improvements

  • The BGPElem fields attribute is now cached so that subsequent field accesses do not needlessly rebuild the entire fields dictionary.
  • By popular demand, communities in PyBGPStream are now returned as a set of "asn:value" strings.
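As a small sketch of how the new community representation might be consumed, the helper below extracts the values attached to a given ASN from a set of "asn:value" strings. The function name and the sample community values are invented for illustration; only the "asn:value" string format comes from the text above.

```python
def communities_for_asn(communities, asn):
    """Return the integer values of all communities whose ASN part is `asn`.

    `communities` is a set of "asn:value" strings.
    """
    values = set()
    for community in communities:
        asn_part, value_part = community.split(":", 1)
        if int(asn_part) == asn:
            values.add(int(value_part))
    return values


# Invented sample data in the "asn:value" format.
sample = {"3356:100", "3356:2010", "65000:1"}
print(sorted(communities_for_asn(sample, 3356)))
```

Because the communities are a plain Python set of strings, simple membership tests such as "3356:100" in sample also work directly.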

Misc. Improvements

  • Added bgpstream_pfx_copy method
  • Added bgpstream_as_path_get_origin_val method to extract origin ASN as simple integer. (Contributed by Samir Al-Sheikh.)
  • No longer require time interval to be set. This simplifies use of the "singlefile" data interface.
  • Added documentation of the AS Path string format.
  • BGPReader data interface options are now specified as -o <param>=<value> rather than -o <param>,<value>.
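If you have scripts that build bgpreader command lines, the change to the -o syntax can be handled mechanically. The following is a minimal sketch; the helper name upgrade_option is hypothetical, and it assumes that a V1 option never contains "=" before its first comma.

```python
def upgrade_option(option):
    """Convert a V1-style "param,value" option to the V2 "param=value" form.

    Options that already look like "param=value" are returned unchanged.
    """
    param, comma, value = option.partition(",")
    if not comma or "=" in param:
        # No comma at all, or already in the new form: leave untouched.
        return option
    return "{}={}".format(param, value)


print(upgrade_option("cache-dir,/path/to/cache"))
```

The converted string is what would follow -o on the V2 bgpreader command line.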

Downloads

libBGPStream

Source Tarball libbgpstream-2.0.0-rc2.tar.gz
Clone on GitHub libBGPStream

PyBGPStream

Source Tarball pybgpstream-2.0.0.tar.gz
Clone on GitHub PyBGPStream

Installing libBGPStream and PyBGPStream

The following instructions assume some familiarity with installing software from source and are specific to Ubuntu and macOS. For additional help, or help installing BGPStream on another OS, please contact bgpstream-info@caida.org.

Note: installing BGPStream V2 is very similar to installing V1, so you may find the detailed libBGPStream and PyBGPStream install guides useful.

Also, while the README for PyBGPStream suggests installing using pip, the v2 beta has not yet been uploaded to PyPI and must be installed from source.

Ubuntu Instructions

1. Ensure dependencies are installed:

sudo apt-get install build-essential zlib1g-dev libbz2-dev libcurl4-openssl-dev python-dev

2. Install wandio:

curl -LO https://research.wand.net.nz/software/wandio/wandio-4.2.1.tar.gz
tar zxf wandio-4.2.1.tar.gz
cd wandio-4.2.1/
./configure
make
sudo make install
sudo ldconfig

3. Install rdkafka:

curl -LO https://github.com/edenhill/librdkafka/archive/v0.11.1.tar.gz
tar zxf v0.11.1.tar.gz
cd librdkafka-0.11.1/
./configure
make
sudo make install
sudo ldconfig

4. Install libBGPStream:

curl -LO https://github.com/CAIDA/libbgpstream/releases/download/v2.0-rc2/libbgpstream-2.0.0-rc2.tar.gz
tar zxf libbgpstream-2.0.0-rc2.tar.gz
cd libbgpstream-2.0.0/
./configure
make
make check
sudo make install
sudo ldconfig

5. Install PyBGPStream:

curl -LO https://github.com/CAIDA/pybgpstream/releases/download/v2.0/pybgpstream-2.0.0.tar.gz
tar zxf pybgpstream-2.0.0.tar.gz
cd pybgpstream-2.0.0/
python setup.py build_ext
sudo python setup.py install

macOS Instructions

These instructions assume that you are using Homebrew to manage packages. If you are not using a package manager, or are using MacPorts, follow steps 2 and 3 of the Ubuntu install instructions instead of step 1 below (but do not execute the sudo ldconfig commands).

1. Install wandio and librdkafka:

brew install wandio librdkafka

2. Install wandio from source (skip this step if wandio was already installed via Homebrew in step 1):

curl -LO https://research.wand.net.nz/software/wandio/wandio-4.2.1.tar.gz
tar zxf wandio-4.2.1.tar.gz
cd wandio-4.2.1/
./configure
make
sudo make install

3. Install libBGPStream:

curl -LO https://github.com/CAIDA/libbgpstream/releases/download/v2.0-rc2/libbgpstream-2.0.0-rc2.tar.gz
tar zxf libbgpstream-2.0.0-rc2.tar.gz
cd libbgpstream-2.0.0/
./configure
make
make check
sudo make install

4. Install PyBGPStream:

curl -LO https://github.com/CAIDA/pybgpstream/releases/download/v2.0/pybgpstream-2.0.0.tar.gz
tar zxf pybgpstream-2.0.0.tar.gz
cd pybgpstream-2.0.0/
python setup.py build_ext
sudo python setup.py install
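After installing on either platform, a quick sanity check is to confirm that the Python modules can be found by the interpreter. The sketch below is an illustrative helper, not part of the BGPStream distribution; it assumes Python 3.4 or newer (for importlib.util.find_spec) and only checks that the modules are locatable, not that they work.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


def check_install(names=("pybgpstream", "_pybgpstream")):
    """Map each module name to whether the interpreter can find it."""
    return {name: module_available(name) for name in names}


for name, found in check_install().items():
    print("{}: {}".format(name, "found" if found else "NOT found"))
```

If a module is reported as not found, check that the install step was run with the same Python interpreter that you are using to run the check.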

Upgrading Code

There have been some changes made to the libBGPStream and PyBGPStream APIs. Below are some instructions for upgrading your code to work with BGPStream V2.

libBGPStream

There have been several changes made to the C API, but all changes will result in compiler errors, so it should be safe to use the compiler to direct you to code that needs changing. You should use the BGPReader code as a reference for using the new API, and please contact bgpstream-info@caida.org if you need further assistance.

PyBGPStream

We have added a prototype high-level Python module, pybgpstream, which should be used instead of the low-level _pybgpstream module. Below is a short working example using this API. See the bundled sample template for more information. We are still developing documentation for this module, so for now, the best reference is the code. Please create an issue (or PR!) on GitHub if you have any suggestions for improving this interface.

import pybgpstream
stream = pybgpstream.BGPStream(
    from_time="2017-07-07 00:00:00", until_time="2017-07-07 00:10:00 UTC",
    collectors=["route-views.sg", "route-views.eqix"],
    record_type="updates",
    filter="peer 11666 and prefix more 210.180.0.0/16"
)

for elem in stream:
    # record fields can be accessed directly from elem
    # e.g. elem.time
    # or via elem.record
    # e.g. elem.record.time
    print(elem)
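Building on the example above, here is a minimal sketch of a helper that tallies elems by type while reading at most a fixed number of them, which can be handy for streams that never end. It assumes that each elem exposes a type attribute; the helper name and the stand-in elems used for the demonstration are hypothetical, and no real stream is opened here.

```python
import collections
import itertools


def count_elem_types(elems, limit=None):
    """Count elems by their `type` attribute, reading at most `limit` elems.

    `elems` can be any iterable of elem-like objects; with limit=None the
    whole iterable is consumed.
    """
    counts = collections.Counter()
    for elem in itertools.islice(elems, limit):
        counts[elem.type] += 1
    return counts


# Stand-in objects so the sketch runs without BGPStream installed.
FakeElem = collections.namedtuple("FakeElem", ["type"])
demo = [FakeElem("A"), FakeElem("A"), FakeElem("W"), FakeElem("A")]
print(count_elem_types(demo, limit=3))
```

With BGPStream installed, the stream object from the example above would be passed in place of the demo list.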

_pybgpstream

If you wish to continue using the _pybgpstream module, the API has changed slightly:

1. Change the import line from:

from _pybgpstream import BGPStream, BGPRecord, BGPElem

to:

import _pybgpstream

2. Change the line that creates an instance of BGPStream from something like:

stream = BGPStream()

to:

stream = _pybgpstream.BGPStream()

3. Delete the line that creates the BGPRecord instance:

rec = BGPRecord()

4. The "get_next_record" API has been changed to be the same as the "get_next_elem" API, so you will go from nested loops like this:

while(stream.get_next_record(rec)):
    elem = rec.get_next_elem()
    while(elem):
        # do something with the elem
        elem = rec.get_next_elem()

to this:

rec = stream.get_next_record()
while(rec):
    elem = rec.get_next_elem()
    while(elem):
        # do something with the elem
        elem = rec.get_next_elem()
    rec = stream.get_next_record()
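If the nested loops appear in several places in your code, one option is to wrap them in a generator. The sketch below assumes only what the loops above rely on: get_next_record() and get_next_elem() return a falsy value when exhausted. The FakeRecord and FakeStream classes are stand-ins so the sketch can run without BGPStream installed; they are not part of _pybgpstream.

```python
def iter_elems(stream):
    """Yield (record, elem) pairs using the V2 get_next_record() style."""
    rec = stream.get_next_record()
    while rec:
        elem = rec.get_next_elem()
        while elem:
            yield rec, elem
            elem = rec.get_next_elem()
        rec = stream.get_next_record()


class FakeRecord(object):
    """Stand-in record that hands out a fixed list of elems."""

    def __init__(self, elems):
        self._elems = list(elems)

    def get_next_elem(self):
        return self._elems.pop(0) if self._elems else None


class FakeStream(object):
    """Stand-in stream that hands out a fixed list of records."""

    def __init__(self, records):
        self._records = list(records)

    def get_next_record(self):
        return self._records.pop(0) if self._records else None


demo_stream = FakeStream([FakeRecord(["e1", "e2"]), FakeRecord([]), FakeRecord(["e3"])])
seen = [elem for _rec, elem in iter_elems(demo_stream)]
print(seen)
```

With a real _pybgpstream.BGPStream instance, the body of your old inner loop would move into a single for loop over iter_elems(stream).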

BGPReader

To accommodate new fields in the underlying record and elem structures (e.g., those specific to BMP data like "router"), the record and elem ASCII formats have changed:

V1 Record Format (old)

<dump-type>|<dump-pos>|<project>|<collector>|<status>|<dump-time>

V2 Record Format (new)

<type>|<dump-pos>|<rec-ts-sec>.<rec-ts-usec>| \
  <project>|<collector>|<router>|<router-ip>|<status>|<dump-time>

V1 Elem Format (old)

<dump-type>|<elem-type>|<record-ts>| \
  <project>|<collector>|<peer-ASN>|<peer-IP>| \
  <prefix>|<next-hop-IP>|<AS-path>|<origin-AS>| \
  <communities>|<old-state>|<new-state>

V2 Elem Format (new)

<rec-type>|<elem-type>|<rec-ts-sec>.<rec-ts-usec>| \
  <project>|<collector>|<router>|<router-ip>|<peer-ASN>|<peer-IP>| \
  <prefix>|<next-hop-IP>|<AS-path>|<origin-AS>| \
  <communities>|<old-state>|<new-state>

Note that timestamps are now represented with sub-second precision. E.g., 1499385779 is now represented as 1499385779.000000.
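For scripts that parse bgpreader output, the V2 elem format above can be split into named fields as in the following sketch. The field names mirror the placeholders in the format description; the function name and the sample values are invented for illustration and do not come from real bgpreader output.

```python
V2_ELEM_FIELDS = (
    "rec_type", "elem_type", "timestamp",
    "project", "collector", "router", "router_ip", "peer_asn", "peer_ip",
    "prefix", "next_hop_ip", "as_path", "origin_as",
    "communities", "old_state", "new_state",
)


def parse_v2_elem(line):
    """Split one V2 elem line into a dict keyed by field name.

    The timestamp is additionally split into integer seconds and
    microseconds (keys "ts_sec" and "ts_usec").
    """
    values = line.rstrip("\n").split("|")
    if len(values) != len(V2_ELEM_FIELDS):
        raise ValueError("expected {} fields, got {}".format(
            len(V2_ELEM_FIELDS), len(values)))
    elem = dict(zip(V2_ELEM_FIELDS, values))
    sec, _, usec = elem["timestamp"].partition(".")
    elem["ts_sec"] = int(sec)
    elem["ts_usec"] = int(usec or 0)
    return elem


# Invented sample values, joined in the order given by the V2 elem format.
sample_line = "|".join([
    "U", "A", "1499385779.000000",
    "routeviews", "route-views.sg", "", "", "11666", "192.0.2.1",
    "210.180.0.0/24", "192.0.2.1", "11666 64500", "64500",
    "11666:100", "", "",
])
parsed = parse_v2_elem(sample_line)
print(parsed["collector"], parsed["prefix"], parsed["ts_sec"], parsed["ts_usec"])
```

Code written for the V1 format will need its field offsets adjusted, since the router and router-ip fields are inserted before the peer fields.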

Public Live BMP Stream

As part of the BGPStream project, we have begun to provide a public BMP feed. Currently we are only providing data from a few Route Views and Cisco Research peers, but we expect additional peers to be added soon. (If you would like to contribute a feed, please contact us at bgpstream-info@caida.org.)

We are providing access to these feeds by way of a publicly-accessible, read-only, Kafka cluster (bmp.bgpstream.caida.org:9092) which contains raw BMP data encapsulated in a custom OpenBMP message header. (We plan to contribute the code we developed to generate these headers back to the upstream OpenBMP repository.)

Unfortunately, this release does not support simultaneous processing of dump-based MRT data and stream-based BMP data, but we plan to add this support in an upcoming beta release. For now you can work around this by using two stream instances.

Accessing the Stream

In this release, access to this live BMP feed is provided by way of the temporary "beta-bmp-stream" data interface. See below for examples of how to use this interface.

BGPStream will use a random Kafka "consumer group" unless the group data interface option is set for the "beta-bmp-stream" data interface. Setting the group option allows the stream to be effectively load-balanced over multiple instances of BGPStream. It also provides some amount of fault-tolerance since BGPStream will resume where it left off.
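As a minimal sketch of setting the group option from PyBGPStream, the helper below calls set_data_interface_option on a stream object. The helper name use_consumer_group and the group name are hypothetical, and RecordingStream is a stand-in that merely records the call so that the sketch runs without BGPStream installed.

```python
def use_consumer_group(stream, group, interface="beta-bmp-stream"):
    """Set the "group" option of the given data interface on `stream`."""
    stream.set_data_interface_option(interface, "group", group)
    return stream


class RecordingStream(object):
    """Stand-in for a BGPStream object; remembers the options it was given."""

    def __init__(self):
        self.options = []

    def set_data_interface_option(self, interface, name, value):
        self.options.append((interface, name, value))


demo = use_consumer_group(RecordingStream(), "my-analysis-group")
print(demo.options)
```

Every BGPStream instance that should share the load would be given the same group name, while unrelated applications should choose different names so that each receives the full stream.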

From BGPReader

Accessing the feed from BGPReader is as simple as choosing the "beta-bmp-stream" data interface:

bgpreader -d beta-bmp-stream

From PyBGPStream

import pybgpstream
for elem in pybgpstream.BGPStream(data_interface="beta-bmp-stream"):
    print(elem)

Private OpenBMP Collector

If you have a router that you would like to monitor using OpenBMP and BGPStream, you can use the dockerized OpenBMP deployment from the OpenBMP project, and then simply configure the "kafka" data interface of BGPStream to stream data from your collector.

1. Setting up the OpenBMP collector

You will first need to deploy and configure an OpenBMP collector and Kafka instance. The easiest way to do this is using the docker container provided by the OpenBMP project. See the OpenBMP documentation for a detailed tutorial. (You do not need to run any consumers to use OpenBMP with BGPStream.)

2. Configure router to send BMP data to the collector

See the OpenBMP documentation for some sample router configurations.

3. Install BGPStream

See the BGPStream Install instructions above for more information.

4. Configure BGPStream to stream data from your collector

Use the "kafka" data interface of BGPStream, and configure it to point to your OpenBMP Kafka instance.

For example, if you are using the BGPReader CLI:

bgpreader -d kafka \
  -o brokers=<docker_host>:9092 \
  -o topic=openbmp.bmp_raw

Or, if you are using the PyBGPStream Python API here is a minimal working example:

import pybgpstream
stream = pybgpstream.BGPStream(data_interface="kafka")
stream.set_data_interface_option("kafka", "brokers", "<docker_host>:9092")
stream.set_data_interface_option("kafka", "topic", "openbmp.bmp_raw")
for elem in stream:
    print(elem)