Usage¶
This document describes how to use the methods and classes provided by cyberpandas.
We’ll assume that the following imports have been performed.
In [1]: import ipaddress
In [2]: import pandas as pd
In [3]: from cyberpandas import IPArray, to_ipaddress
Parsing¶
First, you’ll need some IP address data. Much like pandas’ pandas.to_datetime(), cyberpandas provides to_ipaddress() for converting a sequence of address-like values (strings, bytes, or integers) to a specialized array, IPArray in this case.
From Strings¶
to_ipaddress() can parse a sequence of strings where each element represents an IP address.
In [4]: to_ipaddress([
...: '192.168.1.1',
...: '2001:0db8:85a3:0000:0000:8a2e:0370:7334',
...: ])
...:
Out[4]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
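Note that the IPv6 address in the output is shown in its compressed form. As a minimal sketch of that normalization, the snippet below uses only the standard-library ipaddress module rather than cyberpandas itself; the variable names are illustrative.

```python
import ipaddress

raw = [
    "192.168.1.1",
    "2001:0db8:85a3:0000:0000:8a2e:0370:7334",
]

# ip_address() returns an IPv4Address or an IPv6Address depending on the text.
parsed = [ipaddress.ip_address(s) for s in raw]

# str() gives the canonical text form: leading zeros are dropped and the
# longest run of zero groups is collapsed to "::".
canonical = [str(addr) for addr in parsed]
print(canonical)
```

Both spellings of the IPv6 address compare equal once parsed, so the verbose and compressed forms refer to the same value.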
You can also parse a container of bytes (Python 2 parlance).
In [5]: to_ipaddress([
...: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\xa8\x01\x01',
...: b' \x01\r\xb8\x85\xa3\x00\x00\x00\x00\x8a.\x03ps4',
...: ])
...:
Out[5]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
If you have a buffer / bytestring, see From Bytes.
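Each bytes element in the example above is 16 bytes long. The following sketch, which assumes only the standard-library ipaddress module, shows one way such values could be produced from address objects; the left-padding of the IPv4 address mirrors the example and is not taken from the cyberpandas source.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:db8:85a3::8a2e:370:7334")

# .packed is the big-endian binary form: 4 bytes for IPv4, 16 bytes for IPv6.
v4_padded = v4.packed.rjust(16, b"\x00")  # left-pad the IPv4 bytes to 16 bytes
v6_packed = v6.packed

print(v4_padded)
print(v6_packed)
```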
From Integers¶
An IP address is ultimately just an integer (32 bits for IPv4, 128 bits for IPv6), and to_ipaddress() can parse a sequence of them.
In [6]: to_ipaddress([
...: 3232235777,
...: 42540766452641154071740215577757643572
...: ])
...:
Out[6]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
There’s also the IPArray.from_pyints() method that does the same thing.
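To see where these integers come from, here is a small sketch using the standard-library ipaddress module (not cyberpandas) that converts between the text and integer forms; the variable names are illustrative.

```python
import ipaddress

# int() on an address object gives its numeric value.
as_int = int(ipaddress.ip_address("192.168.1.1"))
print(as_int)  # 3232235777

# Passing an integer back to ip_address() reverses the conversion.
# Values below 2**32 are interpreted as IPv4, larger ones as IPv6.
small = ipaddress.ip_address(3232235777)
large = ipaddress.ip_address(42540766452641154071740215577757643572)
print(small, large)
```

One consequence of this rule is that a small integer such as 0 is read as the IPv4 address 0.0.0.0 rather than as an IPv6 address.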
From Bytes¶
If you have a correctly structured buffer of bytes or bytestring, you can directly construct an IPArray without any intermediate copies.
In [7]: stream = (b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0\xa8\x01'
...: b'\x01 \x01\r\xb8\x85\xa3\x00\x00\x00\x00\x8a.\x03ps4')
...:
In [8]: IPArray.from_bytes(stream)
Out[8]: IPArray(['192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
stream is expected to be a sequence of bytes representing IP addresses (note that it’s just a bytestring that’s been split across two lines for readability). Each IP address should be 128 bits, left-padded with 0s for IPv4 addresses. In particular, IPArray.to_bytes() produces such a sequence of bytes.
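The layout described here can be sketched with the standard library alone. The helpers pack_addresses and unpack_addresses below are hypothetical and are not part of cyberpandas; they assume big-endian (network) byte order, which matches the example stream, and they assume that any value fitting in 32 bits should be read back as IPv4.

```python
import ipaddress


def pack_addresses(addresses):
    """Concatenate addresses as 16-byte big-endian integers (hypothetical helper)."""
    return b"".join(int(a).to_bytes(16, "big") for a in addresses)


def unpack_addresses(stream):
    """Split a byte stream into 16-byte chunks and decode each (hypothetical helper)."""
    if len(stream) % 16:
        raise ValueError("stream length must be a multiple of 16 bytes")
    out = []
    for start in range(0, len(stream), 16):
        value = int.from_bytes(stream[start:start + 16], "big")
        # Assumption: values that fit in 32 bits are treated as IPv4.
        out.append(ipaddress.ip_address(value))
    return out


addrs = [
    ipaddress.ip_address("192.168.1.1"),
    ipaddress.ip_address("2001:db8:85a3::8a2e:370:7334"),
]
stream = pack_addresses(addrs)
print(len(stream))  # 32
print(unpack_addresses(stream) == addrs)
```

Because every element occupies exactly 16 bytes, the number of addresses is simply the stream length divided by 16, and a fixed-width layout like this is what makes it possible to reinterpret a buffer in place rather than parsing it element by element.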
Pandas Integration¶
IPArray satisfies pandas’ extension array interface, which means that it can safely be stored inside pandas’ Series and DataFrame.
In [9]: values = to_ipaddress([
...: 0,
...: 3232235777,
...: 42540766452641154071740215577757643572
...: ])
...:
In [10]: values
Out[10]: IPArray(['0.0.0.0', '192.168.1.1', '2001:db8:85a3::8a2e:370:7334'])
In [11]: ser = pd.Series(values)
In [12]: ser
Out[12]:
0                         0.0.0.0
1                     192.168.1.1
2    2001:db8:85a3::8a2e:370:7334
dtype: ip
In [13]: df = pd.DataFrame({"addresses": values})
In [14]: df
Out[14]:
                      addresses
0                       0.0.0.0
1                   192.168.1.1
2  2001:db8:85a3::8a2e:370:7334
Most pandas methods that make sense should work. The following section will call out points of interest.
Indexing¶
If your selection returns a scalar, you get back an ipaddress.IPv4Address or ipaddress.IPv6Address.
In [15]: ser[0]
Out[15]: IPv4Address('0.0.0.0')
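Because the scalar is an ordinary standard-library object, its usual attributes are available. The short sketch below builds the same value directly with ipaddress instead of selecting it from a Series, so it runs without cyberpandas.

```python
import ipaddress

scalar = ipaddress.ip_address("0.0.0.0")

print(type(scalar).__name__)   # IPv4Address
print(scalar.version)          # 4
print(scalar.is_unspecified)   # True

# Other addresses expose the same properties.
print(ipaddress.ip_address("192.168.1.1").is_private)  # True
```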
In [16]: df.loc[2, 'addresses']