PyBGPStream Tutorial

PyBGPStream provides a Python interface to the libBGPStream C library.

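Below we provide four tutorials: getting familiar with the API, printing the MOAS prefixes, measuring the extent of AS path inflation, and studying the communities.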


Get familiar with the API

As a first example, we use pybgpstream to output the information extracted from BGP records and BGP elems. We provide a step-by-step description and a link to download the script at the end of the section. The example is fully functional and can be run using the following command:

$ python pybgpstream-print.py
ris rrc11 update 1438417216 valid A 2001:504:1::a500:9002:1 9002 {'next-hop': '2001:504:1::a500:9002:1', 'prefix': '2001:67c:2c44::/48', 'as-path': '9002 21219 50581 49588'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '91.214.14.0/24', 'as-path': '9002 6663 39668 65535 49256'}
ris rrc11 update 1438417216 valid A 2001:504:1::a500:9002:1 9002 {'next-hop': '2001:504:1::a500:9002:1', 'prefix': '2001:67c:2c44::/48', 'as-path': '9002 13249 49588'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '61.7.155.0/24', 'as-path': '9002 2914 3356 4651 131090 131090 131090'}
ris rrc11 update 1438417216 valid A 198.32.160.103 13030 {'next-hop': '198.32.160.103', 'prefix': '41.221.20.0/24', 'as-path': '13030 30781 57023 57023 57023 57023 36947'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '41.221.20.0/24', 'as-path': '9002 3356 12956 36947'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '61.7.155.0/24', 'as-path': '9002 3356 4651 131090 131090 131090'}
ris rrc11 update 1438417216 valid W 198.32.160.42 2497 {'prefix': '103.224.214.0/24'}
ris rrc11 update 1438417216 valid W 198.32.160.42 2497 {'prefix': '103.224.215.0/24'}
ris rrc11 update 1438417216 valid A 2001:504:1::a500:9002:1 9002 {'next-hop': '2001:504:1::a500:9002:1', 'prefix': '2001:67c:2c44::/48', 'as-path': '9002 13249 50581 49588'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '61.7.155.0/24', 'as-path': '9002 6453 4651 131090 131090 131090'}
ris rrc11 update 1438417216 valid A 2001:504:1::a500:9002:1 9002 {'next-hop': '2001:504:1::a500:9002:1', 'prefix': '2001:67c:2c44::/48', 'as-path': '9002 13249 49588'}
ris rrc11 update 1438417216 valid A 198.32.160.182 9002 {'next-hop': '198.32.160.182', 'prefix': '61.7.155.0/24', 'as-path': '9002 3356 4651 131090 131090 131090'}


Step by step description

The first step in each pybgpstream script is to import the Python modules and create a new BGPStream instance as well as a re-usable BGP record instance.

from _pybgpstream import BGPStream, BGPRecord, BGPElem

stream = BGPStream()
rec = BGPRecord()


The second step consists of customizing the stream using the project, collector, type, and time interval filters. The time interval filter is mandatory, whereas the others are optional. In this specific case, we configure the stream to return the BGP records read from an Updates dump generated by the RIS RRC11 collector, having a timestamp equal to Sat, 01 Aug 2015 08:20:16 GMT.

stream.add_filter('collector','rrc11')
stream.add_interval_filter(1438417216,1438417216)
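The interval filter takes Unix timestamps (seconds since the epoch, in UTC). As a minimal sketch using only the Python standard library, the helpers below convert between a UTC date and the epoch value, which can be handy for double-checking an interval before passing it to add_interval_filter. The helper names utc_to_epoch and epoch_to_utc are illustrative and are not part of pybgpstream.

```python
import calendar
import time


def utc_to_epoch(year, month, day, hour=0, minute=0, second=0):
    """Return the Unix timestamp for a UTC date and time."""
    return calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))


def epoch_to_utc(ts):
    """Format a Unix timestamp as a human-readable UTC string."""
    return time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(ts))


start = utc_to_epoch(2015, 8, 1, 8, 20, 16)
print(start)                 # 1438417216
print(epoch_to_utc(start))   # Sat, 01 Aug 2015 08:20:16 GMT
```

The same helpers confirm the interval used in the later tutorials: 07:50:00 and 08:10:00 on 01 Aug 2015 correspond to 1438415400 and 1438416600.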


At this point we can start the stream, and repeatedly ask for new BGP records and BGP elems. Each time a valid record is read, we extract from it the elems that it contains and print the record and elem fields. If a non-valid record is found, we do not attempt to extract elems.

stream.start()

while(stream.get_next_record(rec)):
    if rec.status != "valid":
        print rec.project, rec.collector, rec.type, rec.time, rec.status
    else:
        elem = rec.get_next_elem()
        while(elem):
            print rec.project, rec.collector, rec.type, rec.time, rec.status,
            print elem.type, elem.peer_address, elem.peer_asn, elem.fields
            elem = rec.get_next_elem()
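The nested loops above can optionally be wrapped in a generator, so that the rest of a script deals with a flat sequence of elems. The following is only a sketch: iter_elems is a hypothetical helper, and FakeStream and FakeRecord are stand-ins written here so that the example runs without pybgpstream installed. They mimic only the get_next_record, status, and get_next_elem names used in this tutorial. Unlike the loop above, this helper silently skips non-valid records instead of printing them.

```python
def iter_elems(stream, rec):
    """Yield (rec, elem) for every elem of every valid record."""
    while stream.get_next_record(rec):
        if rec.status != "valid":
            continue  # no elems are extracted from non-valid records
        elem = rec.get_next_elem()
        while elem:
            yield rec, elem
            elem = rec.get_next_elem()


# --- Stand-ins used only to make this sketch runnable on its own ---
class FakeRecord(object):
    def __init__(self):
        self.status = None
        self._elems = []

    def get_next_elem(self):
        # Return the next queued elem, or None when the record is exhausted
        return self._elems.pop(0) if self._elems else None


class FakeStream(object):
    def __init__(self, records):
        # records: list of (status, [elems]) tuples
        self._records = list(records)

    def get_next_record(self, rec):
        if not self._records:
            return False
        rec.status, elems = self._records.pop(0)
        rec._elems = list(elems)
        return True


# "not-valid" is a placeholder status string for this sketch
stream = FakeStream([("valid", ["e1", "e2"]),
                     ("not-valid", []),
                     ("valid", ["e3"])])
rec = FakeRecord()
seen = [elem for _, elem in iter_elems(stream, rec)]
print(seen)   # ['e1', 'e2', 'e3']
```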


Complete Example

Get the code.

#!/usr/bin/env python

from _pybgpstream import BGPStream, BGPRecord, BGPElem

# Create a new bgpstream instance and a reusable bgprecord instance
stream = BGPStream()
rec = BGPRecord()

# Consider RIS RRC 11 only
stream.add_filter('collector','rrc11')

# Consider this time interval:
# Sat Aug  1 08:20:16 UTC 2015
stream.add_interval_filter(1438417216,1438417216)

# Start the stream
stream.start()

# Get next record
while(stream.get_next_record(rec)):
    # Print the record information only if it is not a valid record
    if rec.status != "valid":
        print rec.project, rec.collector, rec.type, rec.time, rec.status
    else:
        elem = rec.get_next_elem()
        while(elem):
            # Print record and elem information
            print rec.project, rec.collector, rec.type, rec.time, rec.status,
            print elem.type, elem.peer_address, elem.peer_asn, elem.fields
            elem = rec.get_next_elem()




Print the MOAS prefixes

In this second tutorial we show how to use pybgpstream to output the MOAS prefixes and their origin ASes. The example is fully functioning and it can be run using the following command:

$ python pybgpstream-moas.py
   194.68.55.0/24 43893,30893
   199.45.53.0/24 701,65403
   207.188.170.0/24 13332,26640
   ...
   193.232.141.0/24 42385,31649
   84.16.224.0/19 28753,16265
   125.208.1.0/24 4808,4847


The program parses the BGP elems extracted from the BGP records that match the filters (collectors, record type, and time), saves in a hash map the list of unique origin ASns for each prefix, and outputs those that have multiple origin ASns.


Step by step description

In this case the stream is configured to return the BGP records read from a RIBs dump generated by the Route Views Singapore collector, having a timestamp in the interval 07:50:00 - 08:10:00 Sat, 01 Aug 2015 GMT.

stream.add_filter('collector','route-views.sg')
stream.add_filter('record-type','ribs')
stream.add_interval_filter(1438415400,1438416600)


We use a dictionary to associate a set of origin ASns with each prefix observed in the RIB dump. Using a set guarantees that each origin ASn is stored only once per prefix.

from collections import defaultdict

prefix_origin = defaultdict(set)


Each time a new BGP elem is extracted, the program extracts the prefix and the origin ASn and updates the prefix_origin dictionary. Prefix and AS path are string fields that are present in any BGP elem of type RIB. The split function converts the AS path string into a list of strings, each one representing an AS hop; the last hop is the origin AS.

pfx = elem.fields['prefix']
ases = elem.fields['as-path'].split(" ")
if len(ases) > 0:
    origin = ases[-1]
    prefix_origin[pfx].add(origin)
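To see the MOAS logic in isolation, the sketch below applies the same steps to a handful of hand-written entries. The prefixes and ASns are hypothetical (documentation ranges), and the plain dictionaries merely stand in for elem.fields; no pybgpstream call is involved.

```python
from collections import defaultdict

# Hypothetical entries standing in for elem.fields of RIB elems
sample_fields = [
    {"prefix": "192.0.2.0/24", "as-path": "64496 64497 64500"},
    {"prefix": "192.0.2.0/24", "as-path": "64498 64501"},
    {"prefix": "198.51.100.0/24", "as-path": "64496 64499 64499 64499"},
    {"prefix": "198.51.100.0/24", "as-path": "64498 64499"},
]

# <prefix, set of origin ASns> dictionary
prefix_origin = defaultdict(set)
for fields in sample_fields:
    ases = fields["as-path"].split(" ")
    # The rightmost ASn in the path is the origin
    prefix_origin[fields["prefix"]].add(ases[-1])

# Keep only the prefixes announced by more than one origin
moas = dict((pfx, sorted(origins))
            for pfx, origins in prefix_origin.items()
            if len(origins) > 1)
print(moas)   # {'192.0.2.0/24': ['64500', '64501']}
```

The second prefix is not reported: although it is seen through two different paths, both end in the same origin ASn.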


Complete Example

Get the code.

#!/usr/bin/env python

from _pybgpstream import BGPStream, BGPRecord, BGPElem
from collections import defaultdict


# Create a new bgpstream instance and a reusable bgprecord instance
stream = BGPStream()
rec = BGPRecord()

# Consider Route Views Singapore only
stream.add_filter('collector','route-views.sg')

# Consider RIBs dumps only
stream.add_filter('record-type','ribs')

# Consider this time interval:
# Sat, 01 Aug 2015 7:50:00 GMT -  08:10:00 GMT
stream.add_interval_filter(1438415400,1438416600)

# Start the stream
stream.start()

# <prefix, origin-ASns-set > dictionary
prefix_origin = defaultdict(set)

# Get next record
while(stream.get_next_record(rec)):
    elem = rec.get_next_elem()
    while(elem):
        # Get the prefix
        pfx = elem.fields['prefix']
        # Get the list of ASes in the AS path
        ases = elem.fields['as-path'].split(" ")
        if len(ases) > 0:
            # Get the origin ASn (rightmost)
            origin = ases[-1]
            # Insert the origin ASn in the set of
            # origins for the prefix
            prefix_origin[pfx].add(origin)
        elem = rec.get_next_elem()

# Print the list of MOAS prefixes and their origin ASns
for pfx in prefix_origin:
    if len(prefix_origin[pfx]) > 1:
        print pfx, ",".join(prefix_origin[pfx])



Measuring the extent of AS path inflation

In this example, we show how to use pybgpstream to measure the extent of AS path inflation, i.e. measure how many AS paths are longer than the shortest path between two ASes due to the adoption of routing policies. The example is fully functioning and it can be run using the following command:

$ python pybgpstream-aspath.py
   ...
   3549 27316 6 5
   3549 27314 3 3
   3549 27313 3 3
   3549 27310 3 3
   3549 27311 3 3
   3549 45834 4 4
   3549 27318 4 3
   3549 27319 5 4
   3549 18173 4 4
   ...


The program reads a RIB dump generated by the RIS RRC00 collector, computes the number of AS hops between the peer ASn and the origin AS, and compares it to the shortest path between the same AS pair in a simple undirected graph built using the AS path adjacencies. The output complies with the following format:

<monitor ASn> <destination ASn> <#AS hops in BGP> <#AS hops in undirected graph>


Step by step description

In this case the stream is configured to return the BGP records read from a RIBs dump generated by the RIS RRC00 collector, having a timestamp in the interval 07:50:00 - 08:10:00 Sat, 01 Aug 2015 GMT.

stream.add_filter('collector','rrc00')
stream.add_filter('record-type','ribs')
stream.add_interval_filter(1438415400,1438416600)


The script uses the NetworkX package utilities to generate a simple undirected graph (i.e. a graph that has neither self-loops nor multiple edges between the same pair of nodes). A dictionary of dictionaries is used to maintain the length of the shortest path between the peer ASn and the origin ASn as observed in BGP.

import networkx as nx
from collections import defaultdict
from itertools import groupby

as_graph = nx.Graph()

bgp_lens = defaultdict(lambda: defaultdict(lambda: None))


Each time a new BGP elem is extracted, the program removes the ASns that are repeatedly prepended in the AS path (using the groupby function), counts the number of AS hops between the peer and the destination AS (i.e. the origin AS), and saves this information in the bgp_lens dictionary. Each adjacency in the reduced AS path is used to add a new link to the NetworkX graph.

peer = str(elem.peer_asn)
hops = [k for k, g in groupby(elem.fields['as-path'].split(" "))]
if len(hops) > 1 and hops[0] == peer:
    origin = hops[-1]
    for i in range(0,len(hops)-1):
        as_graph.add_edge(hops[i],hops[i+1])
    bgp_lens[peer][origin] = min(filter(bool,[bgp_lens[peer][origin],len(hops)]))
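As a small self-contained illustration of the two idioms used here, the sketch below runs groupby and the min(filter(bool, ...)) update on three AS paths copied from the sample output of the first tutorial (peer 9002). The paths are used only as input strings; no pybgpstream call is involved. Note that groupby collapses consecutive repetitions only, which is exactly what AS path prepending produces.

```python
from collections import defaultdict
from itertools import groupby

# AS paths taken from the sample output of the first tutorial
paths = [
    "9002 2914 3356 4651 131090 131090 131090",
    "9002 3356 4651 131090 131090 131090",
    "9002 6453 4651 131090 131090 131090",
]

# bgp_lens[peer][origin] is None until a path is seen for that pair
bgp_lens = defaultdict(lambda: defaultdict(lambda: None))
for path in paths:
    # Collapse consecutive duplicates (prepending) into a single hop
    hops = [k for k, _ in groupby(path.split(" "))]
    peer, origin = hops[0], hops[-1]
    # filter(bool, ...) drops the initial None, so min() sees only numbers
    bgp_lens[peer][origin] = min(filter(bool, [bgp_lens[peer][origin], len(hops)]))

print(bgp_lens["9002"]["131090"])   # 4
```

The first path reduces to five ASns and the other two to four, so the minimum stored for the (9002, 131090) pair is 4.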


Finally, for each peer and origin pair, the script uses the NetworkX utility functions to compute the length of the shortest path between the two nodes in the simple undirected graph. The output juxtaposes the minimum length observed in BGP and the shortest path computed in the simple undirected graph.

for peer in bgp_lens:
    for origin in bgp_lens[peer]:
        nxlen = len(nx.shortest_path(as_graph, peer, origin))
        print peer, origin, bgp_lens[peer][origin], nxlen
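The comparison can also be tried without NetworkX. The sketch below swaps nx.shortest_path for a plain breadth-first search over an adjacency dictionary, and uses two hypothetical AS paths (documentation ASns) chosen so that one pair shows inflation. As in the script, both lengths count the ASns on the path, so the two columns are directly comparable. The helper name shortest_len is illustrative.

```python
from collections import defaultdict, deque
from itertools import groupby


def shortest_len(adj, src, dst):
    """Number of nodes on a shortest src-dst path (breadth-first search)."""
    seen = {src: 1}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return seen[node]
        for nbr in adj[node]:
            if nbr not in seen:
                seen[nbr] = seen[node] + 1
                queue.append(nbr)
    return None  # dst is not reachable from src


# Hypothetical AS paths; the first hop of each path is the peer
paths = ["64496 64497 64498 64499", "64500 64499 64496"]

adj = defaultdict(set)   # undirected adjacency: ASn -> set of neighbours
bgp_lens = {}            # (peer, origin) -> number of ASns in the BGP path
for path in paths:
    hops = [k for k, _ in groupby(path.split(" "))]
    for a, b in zip(hops, hops[1:]):
        adj[a].add(b)
        adj[b].add(a)
    bgp_lens[(hops[0], hops[-1])] = len(hops)

results = dict((pair, (n, shortest_len(adj, pair[0], pair[1])))
               for pair, n in bgp_lens.items())
for (peer, origin), (bgp, graph) in sorted(results.items()):
    print("%s %s %d %d" % (peer, origin, bgp, graph))
# 64496 64499 4 2
# 64500 64496 3 3
```

Peer 64496 reaches 64499 through four ASns in BGP, while the adjacency 64499-64496 learned from the second path makes the two ASes direct neighbours in the undirected graph.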


Complete Example

Get the code.

from _pybgpstream import BGPStream, BGPRecord, BGPElem
from collections import defaultdict
from itertools import groupby
import networkx as nx

# Create a new bgpstream instance and a reusable bgprecord instance
stream = BGPStream()
rec = BGPRecord()

# Create an instance of a simple undirected graph
as_graph = nx.Graph()

bgp_lens = defaultdict(lambda: defaultdict(lambda: None))

# Consider RIS RRC 00 only
stream.add_filter('collector','rrc00')

# Consider RIBs dumps only
stream.add_filter('record-type','ribs')

# Consider this time interval:
# Sat, 01 Aug 2015 7:50:00 GMT -  08:10:00 GMT
stream.add_interval_filter(1438415400,1438416600)

stream.start()

while(stream.get_next_record(rec)):
    elem = rec.get_next_elem()
    while(elem):
        # Get the peer ASn
        peer = str(elem.peer_asn)
        # Get the array of ASns in the AS path and remove repeatedly prepended ASns
        hops = [k for k, g in groupby(elem.fields['as-path'].split(" "))]
        if len(hops) > 1 and hops[0] == peer:
            # Get the origin ASn
            origin = hops[-1]
            # Add new edges to the NetworkX graph
            for i in range(0,len(hops)-1):
                as_graph.add_edge(hops[i],hops[i+1])
            # Update the AS path length between 'peer' and 'origin'
            bgp_lens[peer][origin] = \
                min(filter(bool,[bgp_lens[peer][origin],len(hops)]))
        elem = rec.get_next_elem()

# For each 'peer' and 'origin' pair
for peer in bgp_lens:
    for origin in bgp_lens[peer]:
        # compute the shortest path in the NetworkX graph
        nxlen = len(nx.shortest_path(as_graph, peer, origin))
        # and compare it to the BGP hop length
        print peer, origin, bgp_lens[peer][origin], nxlen



Studying the communities

In this example, we show how to use pybgpstream to extract information about the prefixes that are associated with a specific type of community. Specifically, we use the bgpstream filtering options to select a specific set of prefixes of interest, a specific peer ASn, and any message having at least one community with 3400 as value. The example is fully functional and can be run using the following command:

$ python pybgpstream-communities.py
Community: 2914:3400 ==> 185.84.167.0/24,185.84.166.0/24,185.84.166.0/23
Community: 2914:2406 ==> 185.84.167.0/24,185.84.166.0/24,185.84.166.0/23
Community: 2914:410 ==> 185.84.167.0/24,185.84.166.0/24,185.84.166.0/23
Community: 2914:3475 ==> 185.84.167.0/24,185.84.166.0/24,185.84.166.0/23
Community: 2914:1405 ==> 185.84.167.0/24,185.84.166.0/24,185.84.166.0/23


The program reads a RIB dump generated by the RIS RRC06 collector and selects the messages originated by peer AS 25152 that are associated with 185.84.166.0/23 (or more specifics) and that have at least one community with 3400 as value. The output complies with the following format:

Community: <ASn>:<value> ==> <prefixes affected by the community>


Step by step description

In this case the stream is configured to return the BGP records read from a RIBs dump generated by the RIS RRC06 collector, having a timestamp in the interval 07:50:00 - 08:10:00 Sat, 01 Aug 2015 GMT. The elems are filtered on three conditions: the originating peer AS number, the prefix announced, and the presence of at least one community having 3400 as value.

stream.add_filter('collector','rrc06')
stream.add_filter('record-type','ribs')
stream.add_interval_filter(1438415400,1438416600)
stream.add_filter('peer-asn','25152')
stream.add_filter('prefix','185.84.166.0/23')
stream.add_filter('community','*:3400')
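To make the meaning of the '*:3400' pattern concrete, the sketch below re-implements the matching rule described above in plain Python. It only illustrates the semantics (any ASn, value equal to 3400) and is not how libBGPStream implements the filter; community_matches is a hypothetical helper, and the sample communities are taken from the output shown earlier.

```python
def community_matches(community, pattern):
    """Check an {'asn': ..., 'value': ...} dict against 'asn:value'.

    A '*' in either position of the pattern matches anything.
    """
    want_asn, want_value = pattern.split(":")
    return (want_asn in ("*", str(community["asn"])) and
            want_value in ("*", str(community["value"])))


communities = [{"asn": 2914, "value": 3400}, {"asn": 2914, "value": 410}]

# An elem is selected when at least one of its communities matches
print(any(community_matches(c, "*:3400") for c in communities))   # True
print(any(community_matches(c, "*:666") for c in communities))    # False
```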


A dictionary of sets maintains the set of prefixes affected by each community.

from collections import defaultdict

community_prefix = defaultdict(set)


Each time a new BGP elem is extracted, the program builds a string from the ASn and value fields of each community, and adds the prefix to the set associated with that community.

# Get the prefix
pfx = elem.fields['prefix']
# Get the associated communities
communities = elem.fields['communities']
# for each community save the set of prefixes that are affected
for c in communities:
    ct = str(c["asn"]) + ":" + str(c["value"])
    community_prefix[ct].add(pfx)
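The grouping step can be exercised on its own with hand-written data. In the sketch below, the tuples stand in for elem.fields['prefix'] and elem.fields['communities']; the prefixes and community values appear in the sample output above, but which communities are attached to which prefix is made up for illustration. Sorting is added only to make the printed order deterministic.

```python
from collections import defaultdict

# Stand-ins for (elem.fields['prefix'], elem.fields['communities'])
sample = [
    ("185.84.166.0/23", [{"asn": 2914, "value": 3400},
                         {"asn": 2914, "value": 410}]),
    ("185.84.166.0/24", [{"asn": 2914, "value": 3400}]),
]

# <community, set of prefixes> dictionary
community_prefix = defaultdict(set)
for pfx, communities in sample:
    for c in communities:
        ct = "%d:%d" % (c["asn"], c["value"])
        community_prefix[ct].add(pfx)

for ct in sorted(community_prefix):
    print("Community: %s ==> %s" % (ct, ",".join(sorted(community_prefix[ct]))))
# Community: 2914:3400 ==> 185.84.166.0/23,185.84.166.0/24
# Community: 2914:410 ==> 185.84.166.0/23
```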


Finally, the dictionary is written to standard output.

for ct in community_prefix:
    print "Community:", ct, "==>", ",".join(community_prefix[ct])


Complete Example

Get the code.

#!/usr/bin/env python

from _pybgpstream import BGPStream, BGPRecord, BGPElem
from collections import defaultdict


# Create a new bgpstream instance and a reusable bgprecord instance
stream = BGPStream()
rec = BGPRecord()

# Consider RRC06 only
stream.add_filter('collector','rrc06')

# Consider RIBs dumps only
stream.add_filter('record-type','ribs')

# Consider messages from the 25152 peer only
stream.add_filter('peer-asn','25152')

# Consider entries associated with 185.84.166.0/23 and more specifics
stream.add_filter('prefix','185.84.166.0/23')

# Consider entries having a community attribute with value 3400
stream.add_filter('community','*:3400')

# Consider this time interval:
# Sat, 01 Aug 2015 7:50:00 GMT -  08:10:00 GMT
stream.add_interval_filter(1438415400,1438416600)

# Start the stream
stream.start()

# <community, prefix > dictionary
community_prefix = defaultdict(set)

# Get next record
while(stream.get_next_record(rec)):
    elem = rec.get_next_elem()
    while(elem):
        # Get the prefix
        pfx = elem.fields['prefix']
        # Get the associated communities
        communities = elem.fields['communities']
        # for each community save the set of prefixes
        # that are affected
        for c in communities:
            ct = str(c["asn"]) + ":" + str(c["value"])
            community_prefix[ct].add(pfx)
        elem = rec.get_next_elem()

# Print each community and the prefixes it affects
for ct in community_prefix:
    print "Community:", ct, "==>", ",".join(community_prefix[ct])