Data Web API feed – How to update time series?

Introduction

This document outlines several methods to update your universe of time series using Macrobond’s Python library. If you are using a language other than Python, or prefer to define your own functions, the Data Web API’s HTTPS endpoints support the same mechanisms as described below.

The Data Web API can be used as a data feed service and the methods below are designed to update the time series of interest in your processes or databases efficiently and quickly.

A ‘time series’ in this context consists of a unique identifier, a list of date/value pairs and a set of metadata. Any updates will always contain all this data as one atomic unit.

Updating a time series means retrieving the most up-to-date version of the series, identified by its unique identifier ‘primname’, from the Macrobond database. An update can happen for several reasons: a new data point has been released, historical data points have been revised by the source, or the metadata has changed.

You should always keep track of the time when each time series was last updated. This allows you to do conditional downloads and to determine if a series has changed. The timestamp to use for this purpose is part of the metadata of each series in the form of the attribute ‘LastModifiedTimeStamp’.
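One possible way to keep these timestamps between runs is a small local store keyed by primname. The sketch below is not part of the Macrobond library: the function names and the JSON layout are assumptions made for illustration, and it assumes the timestamps are timezone-aware datetimes.

```python
import json
from datetime import datetime, timezone
from typing import Dict


def dump_timestamps(store: Dict[str, datetime]) -> str:
    """Serialise {primname: LastModifiedTimeStamp} to JSON text (ISO 8601)."""
    return json.dumps(
        {name: ts.isoformat() for name, ts in store.items()}, sort_keys=True
    )


def load_timestamps(text: str) -> Dict[str, datetime]:
    """Read the JSON text written by dump_timestamps back into datetimes."""
    return {
        name: datetime.fromisoformat(ts) for name, ts in json.loads(text).items()
    }


# Hypothetical series name and timestamp, for illustration only.
store = {"series_a": datetime(2024, 1, 25, 13, 30, tzinfo=timezone.utc)}
restored = load_timestamps(dump_timestamps(store))
```

In a real process the JSON text would typically live in a file or a database column next to the series itself.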

Methods #1 and #2 below are intended for users subscribing to Macrobond’s Pay-as-you-Go model via an annual allowance of Unique Time Series (UTS). Method #3 is intended for users subscribing to one of Macrobond’s data packages (see https://www.macrobond.com/data-feed-packages). Data package subscribers can also use methods #1 and #2, but Pay-as-you-Go users cannot use method #3.

This document is for illustration purposes only. Please contact support@macrobond.com for any questions.

Universe Update (get_many_series)

Call get_many_series to download the most up-to-date version of a time series, either to add a new series or to update an existing one.

Input Parameters

The method accepts two parameters: a sequence of series requests, each made up of primname and ifmodifiedsince (optional), and include_not_modified.

  • primname: The name or alias of the time series
  • ifmodifiedsince (optional):
    • The date and time of the LastModifiedTimeStamp from a previous request. It can be left blank for the first download. Recording the LastModifiedTimeStamp of each time series separately is recommended so that it can be passed in later updates.
    • The function compares this parameter with the LastModifiedTimeStamp of the underlying time series. If the input is earlier than the current LastModifiedTimeStamp, the time series is fetched.
  • include_not_modified: Set as False by default. If set to True, the response will include information about all the time series, even if they have not been modified/updated since the previous request (in which case only name, error_message and status_code 304 will be returned).
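As an illustration of the inputs described above, the following sketch builds one (primname, ifmodifiedsince) pair per series from locally stored timestamps. The helper name build_requests, the tuple layout and the series names are assumptions for this example; how the pairs are passed to get_many_series should be checked against the library documentation.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# (primname, ifmodifiedsince); None means "first download, fetch everything".
Request = Tuple[str, Optional[datetime]]


def build_requests(universe: List[str], stored: Dict[str, datetime]) -> List[Request]:
    """Pair each series name with its stored LastModifiedTimeStamp, if any."""
    return [(name, stored.get(name)) for name in universe]


# Hypothetical names: series_a was downloaded before, series_b is new.
stored = {"series_a": datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)}
series_requests = build_requests(["series_a", "series_b"], stored)
```

A series that has never been downloaded gets None, which corresponds to leaving ifmodifiedsince blank.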

Response

The method returns a series entity with dates, values, and metadata. Each entity also exposes the status fields is_error, status_code, and error_message, so you can always check the status of the response:

  • 200 = OK (All is well, full content returned)
  • 304 = NotModified (The item was not modified and is not included in the response)
  • 403 = Forbidden (Access to the item was denied)
  • 404 = NotFound (The item was not found)
  • 500 = Other (There was an error and it is described in the error text)
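The status codes above lend themselves to a simple triage step after each download. The sketch below uses a stand-in class, FakeSeries, that carries only the fields named in this section; it is not the library's entity class, and the helper name triage is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeSeries:
    """Stand-in exposing only the fields named in the text above."""
    name: str
    status_code: int
    error_message: str = ""


def triage(results: List[FakeSeries]) -> Tuple[List[str], List[str], List[str]]:
    """Split results into updated (200), unchanged (304) and failed (others)."""
    updated: List[str] = []
    unchanged: List[str] = []
    failed: List[str] = []
    for series in results:
        if series.status_code == 200:
            updated.append(series.name)
        elif series.status_code == 304:
            unchanged.append(series.name)
        else:
            # 403, 404, 500: keep the code and text for logging or retry logic.
            failed.append(f"{series.name}: {series.status_code} {series.error_message}")
    return updated, unchanged, failed


# Made-up results for illustration.
updated, unchanged, failed = triage([
    FakeSeries("series_a", 200),
    FakeSeries("series_b", 304),
    FakeSeries("series_c", 404, "Not found"),
])
```

Only the series in the first list need to be written to your database; 304 entries appear only when include_not_modified is set to True.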

In addition to the properties, there are multiple helper methods for the returned entity object. For example, you can use to_dict() to return a dictionary or use values_to_pd_data_frame() to represent it as a Pandas DataFrame.

More Information

This method can be automated as part of your retrieval script, but Macrobond also recommends looking into the ‘SubscriptionList’ method described further below for further automation.
For more information, see the get_many_series documentation.

For a detailed example, see the notebook hosted on Macrobond’s GitHub repository: 1.3 - Macrobond Data API - Fetching multiple Time Series

 

Universe Update with Revision History (get_many_series_with_revisions)

You can call get_many_series_with_revisions to download and update your universe with revision history data (a Point-in-Time representation of a time series, also known as a ‘vintage’ time series).

Input Parameters

The method accepts two parameters: Sequence[RevisionHistoryRequest], and include_not_modified

  • RevisionHistoryRequest: A class that represents a series request and has the following fields:
    • name: The name or alias of the time series. The full result is returned if the parameters below are left blank.
    • ifmodifiedsince (optional):
      • The date and time of the LastModifiedTimeStamp from a previous request. It can be left blank for the first download. Recording the LastModifiedTimeStamp of each time series separately is recommended so that it can be passed in later updates.
      • The function compares this parameter with the LastModifiedTimeStamp of the underlying time series. If the input is earlier than the current LastModifiedTimeStamp, the time series is fetched.
    • last_revision (optional):
      • The date and time of the LastRevisionTimeStamp from a previous request.
      • If specified, the function compares this parameter with the LastRevisionTimeStamp metadata of the underlying time series. If the input is earlier than the latest LastRevisionTimeStamp, the function returns only the revisions whose RevisionTimeStamp is later than the last_revision input. PartialContent (206) is returned in that case.
      • last_revision can only be specified together with the ifmodifiedsince parameter.
      • If LastRevisionAdjustmentTimeStamp exists in the underlying time series, last_revision can only be specified together with the last_revision_adjustment parameter. Otherwise, full content (200) will be returned.
    • last_revision_adjustment (optional):
      • The date and time of the LastRevisionAdjustmentTimeStamp from a previous request (if any).
      • If specified, the function compares this with LastRevisionAdjustmentTimeStamp. If the input is earlier, full content (200) is returned to sync the modification of the whole revision history. Otherwise, the result depends on the last_revision input.
  • include_not_modified: Set as False by default. If set to True, the response will include information about all the time series, even if they have not been modified/updated since the previous request (in which case only name, error_message and status_code 304 will be returned).
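One way to respect the rules above is to always send back all timestamps saved from the previous response. The sketch below illustrates this with a stand-in type, StoredState, and a hypothetical helper, next_request_fields, that returns the request fields as a plain dict; neither is part of the Macrobond library, and mapping the dict onto RevisionHistoryRequest is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class StoredState:
    """Timestamps saved from the previous response for one series (stand-in)."""
    last_modified: datetime
    last_revision: Optional[datetime] = None
    last_revision_adjustment: Optional[datetime] = None


def next_request_fields(name: str, state: Optional[StoredState]) -> Dict[str, object]:
    """Return the fields for the next request as a plain dict."""
    if state is None:
        # First download: leave every optional field blank to get the full result.
        return {"name": name}
    # last_modified is mandatory in StoredState, so last_revision is never
    # sent without ifmodifiedsince.
    fields: Dict[str, object] = {"name": name, "ifmodifiedsince": state.last_modified}
    if state.last_revision is not None:
        fields["last_revision"] = state.last_revision
    if state.last_revision_adjustment is not None:
        fields["last_revision_adjustment"] = state.last_revision_adjustment
    return fields


# Hypothetical stored state for illustration.
state = StoredState(
    last_modified=datetime(2024, 1, 1, tzinfo=timezone.utc),
    last_revision=datetime(2023, 12, 1, tzinfo=timezone.utc),
)
fields = next_request_fields("series_a", state)
```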

Response

The result is an object of class SeriesWithVintages, whose fields are:

  • vintages: A list of objects of class VintageValues, whose fields are:
    • dates
    • values
    • vintage_time_stamp
  • status_code:
    • 200 = OK (All is well, full content returned)
    • 206 = PartialContent (The operation was successful, but only new revisions are included)
    • 304 = NotModified (The item was not modified and is not included in the response)
    • 403 = Forbidden (Access to the item was denied)
    • 404 = NotFound (The item was not found)
    • 500 = Other (There was an error and it is described in the error text)
  • error_text: The error text, if there was an error.
  • last_modified: The timestamp of the last modification, to be used in the next call.
  • last_revision: The timestamp of the last revision, to be used in the next call.
  • last_revision_adjustment: The timestamp of the last revision adjustment, to be used in the next call.
  • metadata: The metadata of the underlying time series
  • primary_name: The PrimName of the time series.
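How a response is merged into local storage depends on its status_code. The sketch below shows one possible approach under a stated assumption: on 206 the returned vintages are appended to the stored ones, while on 200 the stored history is replaced. The Vintage tuple is a simplified stand-in for VintageValues, and apply_response is a hypothetical helper.

```python
from typing import List, Tuple

# (vintage_time_stamp, values): simplified stand-in for VintageValues.
Vintage = Tuple[str, List[float]]


def apply_response(
    stored: List[Vintage], status_code: int, vintages: List[Vintage]
) -> List[Vintage]:
    """Return the vintages to keep locally after one response."""
    if status_code == 200:
        # Full content: replace whatever was stored.
        return list(vintages)
    if status_code == 206:
        # Partial content: only new revisions were returned (assumed appendable).
        return stored + list(vintages)
    if status_code == 304:
        # Not modified: keep the stored history as is.
        return stored
    raise RuntimeError(f"series not updated, status_code={status_code}")


# Made-up vintages for illustration.
history = [("2024-01-01T00:00:00Z", [1.0, 2.0])]
history = apply_response(history, 206, [("2024-02-01T00:00:00Z", [1.0, 2.1])])
```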

More Information

For more information, see the get_many_series_with_revisions documentation.

For a detailed example, see the notebook hosted on Macrobond’s GitHub repository: 4.1 - Macrobond Data API - Revision History.ipynb

Polling for updates (SubscriptionList)

SubscriptionList allows you to define your universe of time series and efficiently poll for updates with minimal latency. In practice, you use it in an infinite loop, calling the ‘poll’ method and passing the timeStampForIfModifiedSince returned by the previous poll. The master list can contain as many time series as your annual data allowance plan covers, counted in Unique Time Series (UTS).

When you add a new time series to the list, you should download the series first as described in section 1 or 2, add it to your database and only then include it in the subscription list.

Methods

  • set: Declare which series to include in the subscription list
  • add: Add one or more series to the subscription list
  • remove: Remove one or more series from the subscription list
  • poll: Polls for any changes to the series in the subscription list. It calls the API endpoint subscriptionlist/getupdates, which returns three fields:
    • timeStampForIfModifiedSince: Timestamp for next poll
    • noMoreChanges: If true, the poll loop will rest for 15 seconds (default)
    • entities: A list of entity names and timestamps when last modified

Input Parameters

  • last_modified: The timestamp of when the subscription list was last modified.

Response

If there are any updates, the poll will return a dictionary of primary keys that have been updated, and the corresponding timestamp of the last update. If there are no updates, the method will return an empty dictionary after the poll interval time. This gives an opportunity to abort the polling loop.

You should compare the time stamp returned with the time stamp you have stored in your database to determine if the series is actually updated compared to the version that is stored. In rare cases, the polling may include some series that have not been updated.
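The comparison described above can be sketched as a small filter. It assumes the poll result is available as a dictionary of {primname: last modified timestamp}, as described in this section; the helper name changed_series and the series names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List


def changed_series(
    polled: Dict[str, datetime], stored: Dict[str, datetime]
) -> List[str]:
    """Names whose polled timestamp is newer than the stored one (or unknown)."""
    return sorted(
        name
        for name, ts in polled.items()
        if name not in stored or ts > stored[name]
    )


# Made-up timestamps for illustration.
utc = timezone.utc
stored = {
    "series_a": datetime(2024, 5, 1, 8, 0, tzinfo=utc),
    "series_b": datetime(2024, 5, 1, 8, 0, tzinfo=utc),
}
polled = {
    "series_a": datetime(2024, 5, 2, 8, 0, tzinfo=utc),  # newer: download again
    "series_b": datetime(2024, 5, 1, 8, 0, tzinfo=utc),  # same: skip
}
to_download = changed_series(polled, stored)
```

Only the names in to_download then need to be fetched again with one of the download methods described earlier.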

More Information

For more information, see the subscription_list documentation.
For a detailed example, see the Python script hosted on Macrobond’s GitHub repository: subscribing_to_updates.py

  • Please note that for substantially larger universes, the polling interval can be longer than 15 seconds, e.g. 30 or 60 seconds.

Poll for and Retrieve updates (GetDataPackageList)

This method is designed for Data Web API customers subscribing to data packages only (see https://www.macrobond.com/data-feed-packages). It will not work for subscribers of the Pay-as-you-Go model. Users can synchronize the updates of time series in their subscribed package to their environment in an automated and efficient way.

Methods

get_data_package_list is available for small packages including the Global Key Indicators package.

get_data_package_list_chunked should be used for large packages. This will process the data package list in chunks. This is more efficient since the complete list does not have to be in memory and it can be processed while downloading.

Both methods call the same API endpoint: series/getdatapackagelist.

Input parameters

  • if_modified_since: Specify a timestamp to see what has changed since then. To get all items returned, leave the parameter as None. Otherwise, pass only the last returned timeStampForIfModifiedSince; any other arbitrary timestamp will not work.
  • chunk_size (for method get_data_package_list_chunked): The maximum number of items to include in each List in DataPackageListContext.items.

Response

  • When using get_data_package_list, an object of class DataPackageList is returned. It holds a list of DataPackageListItem objects, each including the entity name and the timestamp when the entity was last modified.
  • When using get_data_package_list_chunked, an object of class DataPackageListContextManager is returned. Make sure it is assigned in a “with ... as” statement.
  • The responses of both methods include:
    • timeStampForIfModifiedSince: A timestamp to pass as the ifModifiedSince parameter in the next request to get incremental updates
    • downloadFullListOnOrAfter: Earliest recommended next poll for a full list, by omitting timeStampForIfModifiedSince.
    • state:
      • 0 = FullListing (A complete listing of all series. Make another request for full data at some point after timestamp in downloadFullListOnOrAfter)
      • 1 = UpToDate (The list contains all updates since the specified start date. Wait 15 minutes before making another request where timeStampForIfModifiedSince is used)
      • 2 = Incomplete (The list might not contain all updates. Wait one minute and then use the timeStampForIfModifiedSince in a new request)
    • entities: The main result, which is the list of entity names and timestamps when the entities were last modified.
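The three states translate into a simple scheduling rule for the next request. The sketch below follows a literal reading of the descriptions above; the function name plan_next_request and the "full"/"incremental" labels are assumptions for illustration and not part of the library.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

FULL_LISTING, UP_TO_DATE, INCOMPLETE = 0, 1, 2


def plan_next_request(
    state: int, now: datetime, download_full_list_on_or_after: datetime
) -> Tuple[str, datetime]:
    """Return (kind of request, earliest time to send it)."""
    if state == UP_TO_DATE:
        # All updates received: wait 15 minutes, then ask for increments.
        return "incremental", now + timedelta(minutes=15)
    if state == INCOMPLETE:
        # More updates may be pending: retry after one minute.
        return "incremental", now + timedelta(minutes=1)
    if state == FULL_LISTING:
        # Next full listing no earlier than the recommended timestamp.
        return "full", download_full_list_on_or_after
    raise ValueError(f"unknown state: {state}")


# Made-up timestamps for illustration.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
full_after = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
kind, not_before = plan_next_request(INCOMPLETE, now, full_after)
```

An "incremental" request passes the last returned timeStampForIfModifiedSince, while a "full" request omits it.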

More Information

For more information, see the get_data_package_list and get_data_package_list_chunked documentation.

Additional Method (current, previous, and differential csv files)

Customers subscribing to a data package can also request from their Macrobond account team their ‘unique_id’. Once provided, customers can retrieve 3 .csv files exposing the current, previous, and differential universe through the below URLs:

https://datapackagelist.api.macrobondfinancial.com/{unique_id}_current.csv
https://datapackagelist.api.macrobondfinancial.com/{unique_id}_previous.csv
https://datapackagelist.api.macrobondfinancial.com/{unique_id}_diff.csv

In the current and previous files, customers will find the columns ‘name’ and ‘state’. In the diff file, customers will find ‘previous_name’, ‘previous_state’, ‘current_name’, and ‘current_state’.

When previous_name and previous_state are missing, the series has just been added to the whitelist.

When current_name and current_state are missing, the series is no longer available.

Note: on the first run, only the current file is available. The previous and diff files become available from the second run onward.
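As an illustration of reading the diff file, the sketch below classifies rows into added and removed series using the column names given above. The sample rows, the state values and the column order are made up for the example, and classify_diff is a hypothetical helper; downloading the file from the URL is left out.

```python
import csv
import io
from typing import List, Tuple


def classify_diff(csv_text: str) -> Tuple[List[str], List[str]]:
    """Return (added, removed) series names from the text of a diff file."""
    added: List[str] = []
    removed: List[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        previous = (row.get("previous_name") or "").strip()
        current = (row.get("current_name") or "").strip()
        if current and not previous:
            added.append(current)      # just appeared on the whitelist
        elif previous and not current:
            removed.append(previous)   # no longer available
    return added, removed


# Made-up sample content; real files may differ in column order and values.
sample = (
    "previous_name,previous_state,current_name,current_state\n"
    ",,series_new,1\n"
    "series_old,1,,\n"
    "series_kept,1,series_kept,2\n"
)
added, removed = classify_diff(sample)
```

Rows where both names are present are skipped here, since they describe series that remain on the list.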