This document describes how to use the methods and classes provided by cyberpandas.

We’ll assume that the following imports have been performed.

In [1]: import ipaddress

In [2]: import pandas as pd

In [3]: from cyberpandas import IPArray, to_ipaddress


First, you’ll need some IP Address data. Much like pandas’ pandas.to_datetime(), cyberpandas provides to_ipaddress() for converting sequences of anything to a specialized array, IPArray in this case.

From Strings

to_ipaddress() can parse a sequence of strings where each element represents an IP address.

In [4]: to_ipaddress([
   ...:     '192.168.1.1',
   ...:     '2001:0db8:85a3:0000:0000:8a2e:0370:7334',
   ...: ])
Out[4]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
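The IPv6 address comes back in compressed form: leading zeros are dropped and the longest run of zero groups collapses to "::". The sketch below reproduces that normalization with the standard-library ipaddress module only. It does not call cyberpandas, and it is an assumption that cyberpandas renders addresses the same way, although the output above is consistent with it.

```python
import ipaddress

# Fully spelled-out address, as passed to to_ipaddress() above.
raw = "2001:0db8:85a3:0000:0000:8a2e:0370:7334"

addr = ipaddress.ip_address(raw)

# str() gives the compressed spelling; .exploded restores the full one.
compressed = str(addr)
print(compressed)     # 2001:db8:85a3::8a2e:370:7334
print(addr.exploded)  # 2001:0db8:85a3:0000:0000:8a2e:0370:7334
```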

You can also parse a container of bytes (Python 2 parlance).

In [5]: to_ipaddress([
   ...:     b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\xa8\x01\x01',
   ...:     b' \x01\r\xb8\x85\xa3\x00\x00\x00\x00\x8a.\x03ps4',
   ...: ])
Out[5]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])

If you have a buffer / bytestring, see From Bytes.

From Integers

IP Addresses are just integers, and to_ipaddress() can parse a sequence of them.

In [6]: to_ipaddress([
   ...:    3232235777,
   ...:    42540766452641154071740215577757643572
   ...: ])
Out[6]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])

There’s also the IPArray.from_pyints() method that does the same thing.
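To see why an integer is enough to identify an address, the following sketch converts the two integers used above with the standard-library ipaddress module. It illustrates the integer-to-address mapping only and makes no claim about how cyberpandas performs the conversion internally.

```python
import ipaddress

# The same integers passed to to_ipaddress() above.
v4 = ipaddress.IPv4Address(3232235777)
v6 = ipaddress.IPv6Address(42540766452641154071740215577757643572)

print(v4)  # 192.168.1.1
print(v6)

# The mapping is reversible: int() recovers the original integer.
assert int(v4) == 3232235777
assert int(v6) == 42540766452641154071740215577757643572
```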

From Bytes

If you have a correctly structured buffer of bytes or bytestring, you can directly construct an IPArray without any intermediate copies.

In [7]: stream = (b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\xa8\x01'
   ...:           b'\x01 \x01\r\xb8\x85\xa3\x00\x00\x00\x00\x8a.\x03ps4')

In [8]: IPArray.from_bytes(stream)
Out[8]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])

stream is expected to be a sequence of bytes representing IP Addresses (note that it’s just a bytestring that’s been split across two lines for readability). Each IP Address should be 128 bits, left padded with 0s for IPv4 addresses. In particular, IPArray.to_bytes() produces such a sequence of bytes.
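The layout described above can be sketched with the standard library alone. The helpers pack_128 and unpack_128 are hypothetical names introduced for this illustration and are not part of cyberpandas. Reading any value below 2**32 as IPv4 is an assumption that follows from the left-padding convention, not a documented rule.

```python
import ipaddress


def pack_128(text):
    """Return the 16-byte, big-endian form of an IPv4 or IPv6 address."""
    # to_bytes(16, ...) left-pads IPv4 values with twelve zero bytes.
    return int(ipaddress.ip_address(text)).to_bytes(16, "big")


def unpack_128(chunk):
    """Decode one 16-byte chunk; values below 2**32 are read as IPv4 (assumption)."""
    value = int.from_bytes(chunk, "big")
    if value < 2**32:
        return ipaddress.IPv4Address(value)
    return ipaddress.IPv6Address(value)


# Two addresses back to back: 2 * 16 = 32 bytes in total.
stream = pack_128("192.168.1.1") + pack_128("2001:db8:85a3::8a2e:370:7334")

# Walk the buffer in fixed 16-byte steps.
decoded = [unpack_128(stream[i:i + 16]) for i in range(0, len(stream), 16)]
print(decoded)
```

Built this way, stream is byte-for-byte the same bytestring as the one passed to IPArray.from_bytes() above.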

Pandas Integration

IPArray satisfies pandas’ extension array interface, which means that it can safely be stored inside pandas’ Series and DataFrame.

In [9]: values = to_ipaddress([
   ...:     0,
   ...:     3232235777,
   ...:     42540766452641154071740215577757643572
   ...: ])

In [10]: values
Out[10]: IPArray(['0.0.0.0', '192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])

In [11]: ser = pd.Series(values)

In [12]: ser
Out[12]:
0                         0.0.0.0
1                     192.168.1.1
2    2001:db8:85a3::8a2e:370:7334
dtype: ip

In [13]: df = pd.DataFrame({"addresses": values})

In [14]: df
Out[14]:
                      addresses
0                       0.0.0.0
1                   192.168.1.1
2  2001:db8:85a3::8a2e:370:7334

Most pandas methods that make sense should work. The following section will call out points of interest.


Indexing

If your selection returns a scalar, you get back an ipaddress.IPv4Address or ipaddress.IPv6Address.

In [15]: ser[0]
Out[15]: IPv4Address('0.0.0.0')

In [16]: df.loc[2, 'addresses']
Out[16]: IPv6Address('2001:db8:85a3::8a2e:370:7334')
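Because scalars are ordinary standard-library objects, their usual attributes are available once a value has been pulled out of a Series or DataFrame. The sketch below builds the two scalar types directly rather than indexing a Series, so it runs without cyberpandas installed.

```python
import ipaddress

# Stand-ins for values returned by scalar indexing.
scalars = [
    ipaddress.IPv4Address("192.168.1.1"),
    ipaddress.IPv6Address("2001:db8:85a3::8a2e:370:7334"),
]

for addr in scalars:
    # .version is 4 or 6; .packed holds the raw network-order bytes.
    print(addr.version, len(addr.packed), addr)
```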

Missing Data

The address 0 (0.0.0.0) is used to represent missing values.

In [17]: ser.isna()
Out[17]:
0     True
1    False
2    False
dtype: bool

In [18]: ser.dropna()
Out[18]:
1                     192.168.1.1
2    2001:db8:85a3::8a2e:370:7334
dtype: ip
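The sentinel convention can be illustrated in plain Python: an address counts as missing when its integer value is 0. This is a sketch of the idea using the standard-library ipaddress module, not the cyberpandas implementation.

```python
import ipaddress

addresses = [
    ipaddress.ip_address(a)
    for a in ("0.0.0.0", "192.168.1.1", "2001:db8:85a3::8a2e:370:7334")
]

# An address is treated as missing when its integer value is 0.
is_missing = [int(a) == 0 for a in addresses]
print(is_missing)  # [True, False, False]

# Equivalent of dropping the missing entries.
present = [a for a, missing in zip(addresses, is_missing) if not missing]
print(present)
```

One consequence of this convention is that 0.0.0.0 cannot be stored as an ordinary, non-missing value.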

IP Accessor

cyberpandas offers an accessor for IP-specific methods.

In [19]: ser.ip.isna
Out[19]:
0     True
1    False
2    False
dtype: bool

In [20]: df['addresses'].ip.is_ipv6
Out[20]:
0    False
1    False
2     True
Name: addresses, dtype: bool