ms_graph_exporter.ms_graph package

Submodules

ms_graph_exporter.ms_graph.api module

Module implements MsGraph class to perform authenticated queries to the API.

MsGraph obtains an OAuth 2.0 token from Azure AD using a Service Principal for subsequent non-interactive authentication with the API endpoint. It also maintains a persistent HTTPS session to the endpoint for efficient network communication.

Example

Use MsGraph like this:

from datetime import datetime, timedelta
from os import environ
from logging import Logger, getLogger
from typing import Any, Dict, List

from ms_graph_exporter.ms_graph.api import MsGraph
from ms_graph_exporter.ms_graph.response import MsGraphResponse

logger: Logger
graph: MsGraph
t_now: datetime
response: MsGraphResponse
batch: List
record: Dict[str, Any]

logger = getLogger(__name__)

graph = MsGraph(
    client_id=environ.get("GRAPH_CLIENT_ID"),
    client_secret=environ.get("GRAPH_CLIENT_SECRET"),
    tenant=environ.get("GRAPH_TENANT"),
)

t_now = datetime.utcnow()

response = graph.get_signins(
    user_id="badc0ffe42@cafe.com",
    timestamp_start=(t_now - timedelta(minutes=10)),
    timestamp_end=(t_now - timedelta(minutes=5)),
    page_size=50
)

for batch in response:
    for record in batch:
        logger.info(
            "%s: %s: %s",
            record["createdDateTime"],
            record["id"],
            record["ipAddress"],
        )

class ms_graph_exporter.ms_graph.api.MsGraph(client_id='', client_secret='', tenant='', *args, **kwargs)[source]

Bases: object

Class to maintain an authenticated connection and send queries to the MS Graph API.

Authenticates with Azure AD, maintains an OAuth 2.0 token and an HTTP session with a connection pool to interact with the MS Graph API.

Variables
  • __api_endpoint (str) – MS Graph API endpoint to call.

  • __api_version (str) – MS Graph API version to call.

  • __logger (Logger) – Channel to be used for log output specific to the module.

  • __throttling_retries (int) – Number of retries when getting API throttling response.

  • _auth_context (AuthenticationContext) – Authentication context maintained by MS ADAL for Python. Manages OAuth 2.0 token cache and refreshes it if necessary.

  • _authority_url (str) – A URL that identifies a token authority. Should be of the format https://login.microsoftonline.com/your_tenant

  • _client_id (str) – The OAuth client id of the calling application. (appId part of the Service Principal)

  • _client_secret (str) – The OAuth client secret of the calling application. (password part of the Service Principal)

  • _tenant (str) – The Azure AD tenant granting the token and where the calling application is registered. Can be in GUID or friendly name format. (tenant part of the Service Principal)

  • _uuid (str) – Universally unique identifier of the class instance to be used in logging.

__init__(client_id='', client_secret='', tenant='', *args, **kwargs)[source]

Initialize class instance.

Parameters
  • client_id (str) – The OAuth client id of the calling application. (appId part of the Service Principal)

  • client_secret (str) – The OAuth client secret of the calling application. (password part of the Service Principal)

  • tenant (str) – The Azure AD tenant granting the token and where the calling application is registered. Can be in GUID or friendly name format. (tenant part of the Service Principal)

  • *args – Variable length argument list for possible extension with subclass.

  • **kwargs – Arbitrary keyword arguments for possible extension with subclass.

Return type

None

__repr__()[source]

Return string representation of class instance.

Return type

str

_build_filter(filter_options=None, filter_join_op='and')[source]

Build OData filter.

Construct an OData filter from a list of tuples holding filter options, operands, and values.

Parameters
  • filter_options (Optional[List[Tuple[str, str, str]]]) – List of ("option", "operand", "value") tuples to construct OData request filter by joining them with filter_join_op operator.

  • filter_join_op (str) – Logic operator (i.e. or or and) to join filter_options with.

Note

filter_options parameter follows this structure:

[
    ("userPrincipalName", "eq", "badc0ffe42@cafe.com"),
    ("createdDateTime",   "ge", "2019-07-26T02:02:02Z"),
    ("createdDateTime",   "le", "2019-07-26T04:04:04Z"),
]

So, a startswith() expression must be presented as:

("startswith(userPrincipalName, 'badcafe')", "", "")

Returns

OData filter.

Return type

str
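
The tuple structure described above can be illustrated with a minimal sketch. The function name build_filter is hypothetical, and the sketch assumes that values are passed through verbatim (no quoting or escaping) and that empty tuple slots are simply skipped; the actual _build_filter() may behave differently.

```python
from typing import List, Optional, Tuple


def build_filter(
    filter_options: Optional[List[Tuple[str, str, str]]] = None,
    filter_join_op: str = "and",
) -> str:
    """Join (option, operand, value) tuples into one OData $filter string.

    Hypothetical helper for illustration only, not the library's code.
    """
    clauses = []
    for option, operand, value in filter_options or []:
        # Assumption: empty slots are dropped, so a complete expression
        # supplied in the first slot is emitted unchanged.
        clauses.append(" ".join(part for part in (option, operand, value) if part))
    return f" {filter_join_op} ".join(clauses)


odata_filter = build_filter(
    [
        ("createdDateTime", "ge", "2019-07-26T02:02:02Z"),
        ("createdDateTime", "le", "2019-07-26T04:04:04Z"),
        ("startswith(userPrincipalName, 'badcafe')", "", ""),
    ]
)
print(odata_filter)
```

Keeping the function expression in the first slot with two empty strings lets one list type carry both comparisons and function calls.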

_http_get_with_auth(api_url, params=None)[source]

Perform authenticated GET request.

Request api_url with params using available OAuth 2.0 token and a connection pool from the established HTTP session.

Note

MS Graph sends HTTP 429 code to signal API throttling with Retry-After HTTP header specifying the wait period in seconds.

Throttling is handled by sleeping for Retry-After seconds and retrying up to __throttling_retries times.

Parameters
  • api_url (str) – URL to be requested with available OAuth 2.0 token.

  • params (Optional[Dict[str, Any]]) – URL parameters to supply with the request.

Returns

HTTP response to the authenticated GET request.

Return type

Response
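
The throttling behaviour described in the note can be sketched without any network access. Everything here is an assumption made for illustration: the name get_with_throttling, the (status, headers) tuple standing in for an HTTP response, the injected send and sleep callables, and the 1 second fallback when Retry-After is missing.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for an HTTP response: (status_code, headers).
FakeResponse = Tuple[int, Dict[str, str]]


def get_with_throttling(
    send: Callable[[], FakeResponse],
    throttling_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> FakeResponse:
    """Call send(), waiting Retry-After seconds after each HTTP 429."""
    response = send()
    for _ in range(throttling_retries):
        status, headers = response
        if status != 429:
            break
        # Retry-After carries the wait period in seconds.
        # Assumption: fall back to 1 second if the header is absent.
        sleep(float(headers.get("Retry-After", 1)))
        response = send()
    return response


# Simulate two throttled replies followed by a success.
replies = iter([(429, {"Retry-After": "2"}), (429, {"Retry-After": "1"}), (200, {})])
waits = []
result = get_with_throttling(lambda: next(replies), sleep=waits.append)
print(result, waits)
```

If every attempt is throttled, the sketch returns the last 429 response to the caller rather than raising; whether the real method does the same is not stated in this documentation.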

_query_api(resource=None, odata_filter=None, page_size=None, cache_enabled=False)[source]

Query MS Graph API.

Perform authenticated request to MS Graph resource with odata_filter. Returns paginated response, if page_size is defined and greater than zero.

Parameters
  • resource (Optional[str]) – Resource/relationship to be queried from MS Graph API endpoint (e.g. me/messages). Expected to start with resource name, but not with /.

  • odata_filter (Optional[str]) – Query filter used to retrieve a subset of available data (e.g. createdDateTime le 2019-07-14T05:20:00).

  • page_size (Optional[int]) – Number of records to be returned in a single (paginated) response. See paging in MS Graph API for more details.

  • cache_enabled (bool) – Flag indicating if response data should be cached (True) or not (False).

Note

If not all requested records fit into the initial response, iterate through the MsGraphResponse instance to retrieve the remaining records in batches of page_size.

Returns

Response which (depending on the page_size) would either contain a full set of returned records, or just the first batch cached and an iterator to get all the subsequent paginated results.

Return type

MsGraphResponse
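
As a rough illustration of how resource, odata_filter and page_size could map onto a request, the sketch below assembles a URL and query parameters. The helper name build_query and the default endpoint and version values are assumptions; $filter and $top are the standard OData query options MS Graph uses for filtering and page size.

```python
from typing import Dict, Optional, Tuple


def build_query(
    resource: str,
    odata_filter: Optional[str] = None,
    page_size: Optional[int] = None,
    api_endpoint: str = "https://graph.microsoft.com",  # assumed default
    api_version: str = "v1.0",  # assumed default
) -> Tuple[str, Dict[str, str]]:
    """Return (url, params) for a query; hypothetical helper."""
    # resource is expected to start with the resource name, not with "/".
    url = f"{api_endpoint}/{api_version}/{resource}"
    params: Dict[str, str] = {}
    if odata_filter:
        params["$filter"] = odata_filter
    # Pagination is requested only for a positive page_size.
    if page_size is not None and page_size > 0:
        params["$top"] = str(page_size)
    return url, params


url, params = build_query(
    "auditLogs/signIns",
    odata_filter="createdDateTime le 2019-07-14T05:20:00Z",
    page_size=50,
)
print(url, params)
```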

_query_api_time_domain(resource=None, filter_options=None, filter_join_op='and', timestamp_start=None, timestamp_end=None, page_size=None, cache_enabled=False)[source]

Query time-domain records from MS Graph API.

Request resource for the time-frame starting at timestamp_start and ending at timestamp_end. Returns paginated response, if page_size is defined.

Note

  • Queries all available data up to timestamp_end, if timestamp_start is not defined.

  • Without timestamp_end defined, gets the data up to the moment of the query execution minus intrinsic (~2 minutes) data population delay of the API.

Parameters
  • resource (Optional[str]) – Resource/relationship to be queried from MS Graph API endpoint (e.g. me/messages). Expected to start with resource name, but not with /.

  • filter_options (Optional[List[Tuple[str, str, str]]]) – List of ("option", "operand", "value") tuples to construct OData request filter by joining them with filter_join_op operator.

  • filter_join_op (str) – Logic operator (i.e. or or and) to join filter_options with.

  • timestamp_start (Optional[datetime]) – Limit results to records with greater or equal createdDateTime values. See Note section below for more details.

  • timestamp_end (Optional[datetime]) – Limit results to records with lower or equal createdDateTime values.

  • page_size (Optional[int]) – Number of records to be returned in a single batch (paginated) response. See Note section below for more details.

  • cache_enabled (bool) – Flag indicating if response data should be cached (True) or not (False).

Note

The MS Graph API v1.0 output below suggests that the API operates with timestamps of 0.1 microsecond precision, resulting in 7 digits after the decimal point for createdDateTime.

{
    "@odata.context": "...snip...",
    "value": [
        {
            "id": "cafebabe-4242-4242-cafe-badc0ffebabe",
            "createdDateTime": "2019-07-21T22:05:58.8424069Z",
            "...snip...": "...snip..."
        }
    ]
}

On the other hand, the Python datetime module only supports time values down to 1 microsecond precision, resulting in 6 digits after the decimal point.

>>> import datetime
>>> timestamp_start = datetime.datetime(2019, 8, 24, 21, 10, 30, 9999999)
Traceback (most recent call last):
    ...
ValueError: microsecond must be in 0..999999
>>> timestamp_start = datetime.datetime(2019, 8, 24, 21, 10, 30, 999999)
>>> timestamp_start.isoformat()
'2019-08-24T21:10:30.999999'

To ensure no records are missed due to difference in timestamp precision between datetime module and the API, both timestamp_start and timestamp_end parameters are truncated to seconds (microseconds value is replaced by 0) for $filter construction. Then .0000000 is added to the string representation of timestamp_start and .9999999 is added to timestamp_end.

In other words, the request time-frame always starts at the first 0.1 microsecond tick of the timestamp_start second and ends at the last 0.1 microsecond tick of the timestamp_end second, matching the precision of MS Graph API v1.0 timestamps.

Returns

A response which (depending on the page_size) would either contain a full set of returned records, or just the first batch cached and an iterator to get all the subsequent paginated results.

Return type

MsGraphResponse
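
The timestamp handling described in the note can be sketched as follows. The helper name time_window_options is hypothetical, the inputs are assumed to be naive datetime values in UTC, and the trailing Z is an assumption based on the sample API output; the default for a missing timestamp_end (query time minus the population delay) is not modelled here.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def time_window_options(
    timestamp_start: Optional[datetime] = None,
    timestamp_end: Optional[datetime] = None,
) -> List[Tuple[str, str, str]]:
    """Build createdDateTime filter tuples with 7 fractional digits."""
    options: List[Tuple[str, str, str]] = []
    if timestamp_start is not None:
        # Truncate to whole seconds, then pad to the first 0.1 us tick.
        start = timestamp_start.replace(microsecond=0).isoformat()
        options.append(("createdDateTime", "ge", f"{start}.0000000Z"))
    if timestamp_end is not None:
        # Truncate to whole seconds, then pad to the last 0.1 us tick.
        end = timestamp_end.replace(microsecond=0).isoformat()
        options.append(("createdDateTime", "le", f"{end}.9999999Z"))
    return options


window = time_window_options(
    datetime(2019, 8, 24, 21, 10, 30, 999999),
    datetime(2019, 8, 24, 21, 15, 30, 5),
)
print(window)
```

Padding the end bound with .9999999 keeps records created anywhere within the final second inside the window, which plain microsecond formatting could not express.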

get_signins(user_id=None, timestamp_start=None, timestamp_end=None, page_size=None, cache_enabled=False)[source]

Get Azure AD signin log records from MS Graph API.

Request user_id login data for the time-frame starting at timestamp_start and ending at timestamp_end. Returns paginated response, if page_size is defined.

Parameters
  • user_id (Optional[str]) – Limit results to records with userPrincipalName equal to user_id.

  • timestamp_start (Optional[datetime]) – Limit results to records with greater or equal createdDateTime values. See _query_api_time_domain() for more details.

  • timestamp_end (Optional[datetime]) – Limit results to records with lower or equal createdDateTime values.

  • page_size (Optional[int]) – Number of records to be returned in a single batch (paginated) response. See _query_api_time_domain() for more details.

  • cache_enabled (bool) – Flag indicating if response data should be cached (True) or not (False).

Returns

A response which (depending on the page_size) would either contain a full set of returned records, or just the first batch cached and an iterator to get all the subsequent paginated results.

Return type

MsGraphResponse

property http_session

Provide access to HTTP session instance.

Open persistent HTTP session with connection pool, if one does not exist yet.

Returns

Persistent HTTP session instance.

Return type

Session

property token

Token to interact with MS Graph API.

Use client credentials within ADAL authentication context to get OAuth 2.0 token from cache if not expired or re-request token otherwise.

Returns

Cached or newly (re-)issued OAuth 2.0 token. Token obtained with client credentials has the following structure:

{
    "tokenType": "Bearer",
    "expiresIn": 3600,
    "expiresOn": "2019-07-12 23:38:57.541597",
    "resource": "https://graph.microsoft.com",
    "accessToken": "{{token string}}",
    "isMRRT": True,
    "_clientId": "cafebabe-4242-4242-cafe-badc0ffebabe",
    "_authority": "https://login.microsoftonline.com/cafebabe-4242-4242-cafe-badc0ffebabe"
}

Return type

Dict[str, Any]
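
A token of the shape shown above is typically turned into an HTTP Authorization header before a request is sent. The helper below is a hypothetical sketch of that step and relies only on the tokenType and accessToken keys from the example structure.

```python
from typing import Any, Dict


def auth_header(token: Dict[str, Any]) -> Dict[str, str]:
    """Build an Authorization header from a token dictionary.

    Hypothetical helper; assumes the keys shown in the example above.
    """
    return {"Authorization": f"{token['tokenType']} {token['accessToken']}"}


headers = auth_header({"tokenType": "Bearer", "accessToken": "example-token"})
print(headers)
```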

ms_graph_exporter.ms_graph.response module

Module implements MsGraphResponse class to handle MS Graph API responses.

MsGraphResponse maintains context to allow efficient retrieval of paginated responses to a query.

class ms_graph_exporter.ms_graph.response.MsGraphResponse(ms_graph, initial_data, initial_url, cache_enabled=False)[source]

Bases: object

Class to handle MS Graph API responses.

Store data from a single query to MS Graph API. Maintain a reference to the specific MsGraph instance which initiated the query and use it to retrieve subsequent parts of the paginated response.

Variables
  • __logger (Logger) – Channel to be used for log output specific to the module.

  • _cache (Dict[str, Optional[Dict[str, Any]]]) – Dictionary holding URLs queried and corresponding results received (if caching enabled), including URLs paged through with __next__().

  • _cache_enabled (bool) – Flag indicating if received data should be cached (True) or not (False).

  • _complete (bool) – Flag indicating if the response is complete (True) or partial (False) and there are more paginated records to fetch.

  • _data_page (List[Dict[str, Any]]) – Last data batch fetched from the API.

  • _initial_url (str) – URL to retrieve the initial data batch.

  • _ms_graph (MsGraph) – API instance to be used for queries.

  • _next_url (str) – URL to retrieve next data batch if response is paginated.

  • _uuid (str) – Universally unique identifier of the class instance to be used in logging.

Note

If the response consists of a single page, which was retrieved and supplied when MsGraphResponse was instantiated, the data is served from memory on subsequent iterations with __iter__() and __next__() and is not re-requested, even when caching is disabled.

__init__(ms_graph, initial_data, initial_url, cache_enabled=False)[source]

Initialize class instance.

Parameters
  • ms_graph (MsGraph) – MS Graph API client instance to be used for queries.

  • initial_data (Optional[Dict[str, Any]]) – Data structure returned by MS Graph API from initial query.

  • initial_url (str) – Initial query URL producing initial_data.

  • cache_enabled (bool) – Flag indicating if response data should be cached (True) or not (False).

Return type

None

__iter__()[source]

Provide iterator for object.

Prepares internal state for iteration from the beginning of the data set and returns object itself as an iterator.

__next__()[source]

Return cached data and prefetch more, if available.

Return type

List[Dict[str, Any]]
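
The interplay of __iter__(), __next__() and prefetching can be sketched with a simplified stand-in class. PagedResponse and its fetch callable are hypothetical, caching is left out, and the sketch assumes the next page URL arrives under the @odata.nextLink key that MS Graph uses for paging.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


class PagedResponse:
    """Simplified, hypothetical stand-in for MsGraphResponse (no cache)."""

    def __init__(self, fetch: Callable[[str], Page], initial_data: Page) -> None:
        self._fetch = fetch
        self._initial_data = initial_data
        self._load(initial_data)

    def _load(self, page: Optional[Page]) -> None:
        # Remember the current batch and where the next one lives.
        self._data_page = None if page is None else page["value"]
        self._next_url = None if page is None else page.get("@odata.nextLink")

    def __iter__(self) -> "PagedResponse":
        # Restart from the first page, which is already held in memory.
        self._load(self._initial_data)
        return self

    def __next__(self) -> List[Dict[str, Any]]:
        if self._data_page is None:
            raise StopIteration
        batch = self._data_page
        # Prefetch the following batch, if the API advertised one.
        self._load(self._fetch(self._next_url) if self._next_url else None)
        return batch


# Fake two-page result set keyed by URL.
pages = {"page-2": {"value": [{"id": 3}]}}
first = {"value": [{"id": 1}, {"id": 2}], "@odata.nextLink": "page-2"}
response = PagedResponse(pages.__getitem__, first)
batches = list(response)
print(batches)
```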

__repr__()[source]

Return string representation of class instance.

Return type

str

_prefetch_next()[source]

Prefetch more responses.

Prefetch next data batch, if more paginated records are available.

Return type

None

_update(api_response, query_url)[source]

Update internal state.

Save the latest api_response received, ensure consistency of the internal metadata and push to cache if enabled.

Raises

ValueError – If api_response does not have value key where the list with response data must reside (even if empty).

Return type

None
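
The validation step described above can be sketched as a small standalone function. The name extract_records is hypothetical, and reading the next page URL from @odata.nextLink is an assumption about how completeness is determined.

```python
from typing import Any, Dict, List, Optional, Tuple


def extract_records(
    api_response: Optional[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Return (records, next_url) or raise ValueError if 'value' is absent.

    Hypothetical helper for illustration only.
    """
    if api_response is None or "value" not in api_response:
        raise ValueError("API response has no 'value' key")
    # An empty list is valid; a missing key is not.
    return api_response["value"], api_response.get("@odata.nextLink")


records, next_url = extract_records({"value": [{"id": "a"}]})
complete = next_url is None  # no further pages advertised
print(records, complete)
```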

Module contents

Package contains the following modules.