Databricks feed

Introduction

You can have any of Macrobond’s time series data delivered into your Databricks environment. The data feed is intended for production purposes with near real-time updates.

Here’s how it works:

Data discovery

The Databricks data feed is used for production purposes only. Macrobond delivers the time series chosen by the consumer into their Databricks environment. There are various options to explore the Macrobond database and find the time series to be delivered into Databricks:

  1. Using the Macrobond.net data catalogue. Macrobond.net is a read-only web portal that offers access to the tree structure and time series associated with every source included in the Macrobond base package. Users can use it to explore the coverage and start documenting the primnames (unique identifiers) of the time series they would like to consume via Databricks.
  2. Using the Macrobond application. If you already subscribe to Core or Data+, the principles are the same as 1. above. If you subscribe to the desktop application, you have access to many more features including the analytics and sharing capabilities. Documenting a universe of time series via the Macrobond application can also be easier as it allows you to download the data in Excel or via API (Data+).
  3. Using the Macrobond Data Web API. Should you need to evaluate programmatically the time series data points, the Data Web API can also be used as a data feed product instead of being only an enabler to using the Databricks data feed.
  4. Using Macrobond’s data team. You can express various parameters that fit your requirements so that Macrobond’s data team can generate a list of time series by converting these parameters into Macrobond-defined metadata fields. This allows you to evaluate the full database at scale. To ensure you receive exactly the universe you need, our team will discuss your requirements with you.

Set up your data feed in the Macrobond management portal

Once you know which time series you would like to receive in Databricks, or at least the initial data set you are interested in, you can request the primnames through a dedicated portal provided by Macrobond.

The Macrobond team will provide access to this URL: https://feed.macrobond.com/login and send your Username and Password via email.

Here is an overview of the key features:

  1. Monitor your data feed properties

    Your contractual allowance appears under “Unique Time Series limit”. You can use your feed during your subscription term (annual basis). The “Changes Per Month” counter shows how many time series you can add to or remove from your feed each month; this corresponds to 10% of your total allowance.
  2. Access your existing Subscription List and visualize or download the list of primnames you are subscribed to, including a flag denoting whether each series is feedable via the feed (compliance module).
  3. Add time series to your data feed. You can enter primnames separated by line breaks or upload a list from a .csv file.
  4. Remove time series from your data feed.

Use your Databricks data feed

Once the time series are added into the Macrobond data feed management portal, the series are immediately delivered into your Databricks environment. Note that depending on the volume of time series, this operation can take some time to load.

Time series updates are also delivered into Databricks with minimal latency, as they are automatically pushed by Macrobond as soon as an entity receives a complete update transaction in the backend.

The Pay-As-You-Go Data feed in Databricks provides access to a few tables and views. A data dictionary and sample notebook are available on the Databricks listing.

Here is an overview of the schema. Columns that are repeated across multiple tables or views will be skipped once described in a table definition.

MAIN_ENTITY [Table]

It delivers the full dataset but should generally not be queried directly, as it is redundant with the following tables and views. Instead, use either LATEST_ENTITY [View], derived from the MAIN_ENTITY table, to obtain the latest version of your time series, or REVISION [Table] to obtain the vintages.

  • PRIMNAME: unique identifier provided by Macrobond to identify a time series.
  • METADATA: nested list of metadata fields for each time series, containing lists or dictionaries.
  • DATES: list of observation periods for the time series. Note that Macrobond uses the beginning of a period to denote the reference date. For instance, 2024-04-01 means “Q2 2024” in the case of a quarterly time series.
  • OBSERVATION_VALUES: list of data points for the time series.
  • ULID: a column intended to be used to join to REVISION (in conjunction with a PRIMNAME). This is a unique identifier for simultaneously transmitted time series (belonging to the same batch when loaded into Databricks). A newer batch of uploads always increments this value. In a given batch of uploads, a given time series can appear only once.
  • NUMBER_OF_REVISIONS: number of unique vintages for the selected time series. The underlying data can be consumed in the REVISION table.
  • LAST_UPDATE: time stamp of when the time series was fully delivered and accessible in Databricks.
  • FEED_QUEUE_INSERT_TIMESTAMP: time stamp of when the Macrobond-designed Databricks worker added the time series to the insert queue as part of a loading batch.
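
As an illustration of the DATES convention described above, the following sketch (plain Python, no Databricks dependency; the function name is our own) converts a beginning-of-period reference date into the quarter it denotes:

```python
from datetime import date

def quarter_label(period_start: date) -> str:
    # Macrobond uses the beginning of a period as the reference date,
    # so 2024-04-01 on a quarterly series denotes Q2 2024.
    quarter = (period_start.month - 1) // 3 + 1
    return f"Q{quarter} {period_start.year}"

print(quarter_label(date(2024, 4, 1)))   # Q2 2024
print(quarter_label(date(2023, 10, 1)))  # Q4 2023
```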

REVISION [Table]

Some columns are already described in the previous table definition. New columns are:

  • VINTAGETIMESTAMP: time stamp of when this version of the time series was made available in Macrobond; the data it contained may have been revised afterwards.

Note that some time series have received a backfill to extend the point-in-time history directly from the source. As such, most of the time stamps prior to 2018 are arbitrary (as opposed to the date stamps).

To understand whether a time series received a backfill, refer to the time series metadata and look for the fields “RevisionHistorySource” and “RevisionHistorySourceCutOffDate”. When both fields exist, the time series was backfilled. The spread between the cut-off date and “FirstRevisionTimeStamp” gives the magnitude of the backfill. Revisions after the cut-off date are all captured and stored by Macrobond, not by the source.

When VINTAGETIMESTAMP = null, the row corresponds to the first unrevised version of a time series as Macrobond knows it. We do not know at which point in time this version was published until there is a first revision; that revision is captured by the subsequent VINTAGETIMESTAMP, which equals FirstRevisionTimeStamp in the metadata, from which point the information is known.

In summary, every time series capturing Revision History in Macrobond starts with a first vintage time stamp = null.

  • VINTAGEDATES: observation dates available as of the vintage time stamp.
  • VINTAGEVALUES: observation points available as of the vintage time stamp.
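
To make the backfill logic above concrete, here is a small sketch under the assumption that the metadata fields are exposed as a Python dictionary with ISO-8601 timestamp strings; the helper name and the sample values are our own, not part of the feed:

```python
from datetime import datetime, timedelta
from typing import Optional

def backfill_span(metadata: dict) -> Optional[timedelta]:
    """Return the magnitude of the backfilled point-in-time history,
    or None when the series was not backfilled."""
    # A series was backfilled only when both fields are present.
    if ("RevisionHistorySource" not in metadata
            or "RevisionHistorySourceCutOffDate" not in metadata):
        return None
    cutoff = datetime.fromisoformat(metadata["RevisionHistorySourceCutOffDate"])
    first_revision = datetime.fromisoformat(metadata["FirstRevisionTimeStamp"])
    # The spread between the cut-off date and FirstRevisionTimeStamp
    # gives the magnitude of the backfill.
    return cutoff - first_revision

meta = {
    "RevisionHistorySource": "source",  # hypothetical values
    "RevisionHistorySourceCutOffDate": "2018-01-01T00:00:00",
    "FirstRevisionTimeStamp": "2005-06-15T00:00:00",
}
print(backfill_span(meta))  # a timedelta spanning the backfill
print(backfill_span({}))    # None: no backfill fields present
```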

SUBSCRIPTION [View]

You see the same information here as in your data feed management portal; this is explained in 2. above. The view is updated once a day.

  • PRIMNAME: requested primname. It can be an alias code.
  • ACTUAL_CODE: primname in the Macrobond database.
  • STATE: denotes the entity state. 0: active | 4: discontinued | -1: deleted
  • AVAILABLE_FOR_FEED: 1: compliant | 0: non-feedable. For a non-feedable time series, you can prove to Macrobond that you have permission or a data license from the source, and Macrobond can create a feedability exception on your account.
  • LAST_UPDATE_IN_DATABRICKS: upload time stamp in Databricks.

LATEST_ENTITY [View]

Derived from the MAIN_ENTITY table, this view provides the latest version of your time series. It is the equivalent of the latest vintage available for time series storing Revision History.

Snowflake data feed

Introduction

You can have any of Macrobond’s time series data delivered into your Snowflake environment. The data feed is intended for production purposes with near real-time updates.

Here’s how it works:

Data Discovery

The Snowflake data feed is used for production purposes only. Macrobond delivers the time series chosen by the consumer into their Snowflake environment. There are various options to explore the Macrobond database and find the time series to be delivered into Snowflake:

  1. Using the Macrobond.net data catalogue. Macrobond.net is a read-only web portal that offers access to the tree structure and time series associated with every source included in the Macrobond base package. Users can use it to explore the coverage and start documenting the primnames (unique identifiers) of the time series they would like to consume via Snowflake.
  2. Using the Macrobond application. If you already subscribe to Core or Data+, the principles are the same as 1. above. If you subscribe to the desktop application, you have access to many more features including the analytics and sharing capabilities. Documenting a universe of time series via the Macrobond application can also be easier as it allows you to download the data in Excel or via API (Data+).
  3. Using the Macrobond Data Web API. Should you need to evaluate programmatically the time series data points, the Data Web API can also be used as a data feed product instead of being only an enabler to using the Snowflake data feed.
  4. Using Macrobond’s data team. You can express various parameters that fit your requirements so that Macrobond’s data team can generate a list of time series by converting these parameters into Macrobond-defined metadata fields. This allows you to evaluate the full database at scale. To ensure you receive exactly the universe you need, our team will discuss your requirements with you.

Set up your data feed in the Macrobond management portal

Once you know which time series you would like to receive in Snowflake, or at least the initial data set you are interested in, you can request the primnames through a dedicated portal provided by Macrobond.

The Macrobond team will provide access to this URL: https://feed.macrobond.com/login and send your Username and Password via email.

Note that you can access the same portal through multiple team members to adjust the settings of your data feed in Snowflake.

Here is an overview of the key functions:

  1. Monitor your data feed properties

    Your contractual allowance appears under “Unique Time Series limit”. You can use your feed during your subscription term (annual basis). The “Changes Per Month” counter shows how many time series you can add to or remove from your feed each month; this corresponds to 10% of your total allowance.
  2. Access your existing Subscription List and visualize or download the list of primnames you are subscribed to, including a flag denoting whether each series is feedable (compliance checks).
  3. Add time series to your data feed. You can enter primnames separated by line breaks or upload a list from a .csv file.
  4. Remove time series from your data feed.

Use your Snowflake data feed

Once the time series are added into the Macrobond data feed management portal, the series are immediately delivered into your Snowflake environment. Note that depending on the volume of time series, this operation can take some time to load.

Time series updates are also delivered into Snowflake with minimal latency, as they are automatically pushed by Macrobond as soon as an entity receives a complete update transaction in our backend system.

The Pay-As-You-Go Data feed in Snowflake provides access to a few tables and views. A data dictionary is available on the Snowflake listing: https://app.snowflake.com/marketplace/listing/GZSYZQKOHJ/macrobond-financial-macrobond-pay-as-you-go-data

Here is an overview of the schema. Columns that are repeated across multiple tables or views will be skipped once described in a table definition.

Data Dictionary

MAIN_ENTITY [Table]

It delivers the full dataset but should generally not be queried directly. Instead, use either LATEST_ENTITY [View], derived from the MAIN_ENTITY table, to obtain the latest version of your time series, or REVISION [Table] to obtain the vintages.

  • PRIMNAME: unique identifier provided by Macrobond to identify a time series.
  • METADATA: nested list of metadata fields for each time series, containing lists or dictionaries.
  • DATES: list of observation periods for the time series. Note that Macrobond uses the beginning of a period to denote the reference date. For instance, 2024-04-01 means “Q2 2024” in the case of a quarterly time series.
  • OBSERVATION_VALUES: list of data points for the time series.
  • ULID: a column intended to be used to join to REVISION (in conjunction with a PRIMNAME). This is a unique identifier for simultaneously transmitted time series (belonging to the same batch when loaded into Snowflake). A newer batch of uploads always increments this value. In a given batch of uploads, a given time series can appear only once.
  • NUMBER_OF_REVISIONS: number of unique vintages for the selected time series. The underlying data can be consumed in the REVISION table.
  • LAST_UPDATE: time stamp when the time series was fully delivered and accessible in Snowflake.
  • FEED_QUEUE_INSERT_TIMESTAMP: time stamp of when the Macrobond-designed Snowflake worker added the time series to the insert queue as part of a loading batch.

REVISION [Table]

Some columns are already described in the previous table definition. New columns are:

  • VINTAGETIMESTAMP: time stamp of when this version of the time series was made available in Macrobond; the data it contained may have been revised afterwards.
    Note that some time series have received a backfill to extend the point-in-time history directly from the source. As such, most of the time stamps prior to 2018 are arbitrary (as opposed to the date stamps).
    To understand whether a time series received a backfill, refer to the time series metadata and look for the fields “RevisionHistorySource” and “RevisionHistorySourceCutOffDate”.
    When both fields exist, the time series was backfilled. The spread between the cut-off date and “FirstRevisionTimeStamp” gives the magnitude of the backfill. Revisions after the cut-off date are all captured and stored by us, not by the source.
    When VINTAGETIMESTAMP = null, the row corresponds to the first unrevised version of a time series as Macrobond knows it. We do not know at which point in time this version was published until there is a first revision; that revision is captured by the subsequent VINTAGETIMESTAMP, which equals FirstRevisionTimeStamp in the metadata, from which point the information is known.
    In summary, every time series capturing Revision History in Macrobond starts with a first vintage time stamp = null.
  • VINTAGEDATES: observation dates available as of the vintage time stamp.
  • VINTAGEVALUES: observation points available as of the vintage time stamp.

SUBSCRIPTION [View]

You see the same information here as in your data feed management portal; this is explained in 2. above. The view is updated once a day.

  • PRIMNAME: requested primname. It can be an alias code.
  • ACTUAL_CODE: primname in the Macrobond database.
  • STATE: denotes the entity state. 0: active | 4: discontinued | -1: deleted
  • AVAILABLE_FOR_FEED: 1: compliant | 0: non-feedable. For a non-feedable time series, you can prove to Macrobond that you have permission or a data license from the source, and Macrobond can create a feedability exception on your account.
  • LAST_UPDATE_IN_SNOWFLAKE: upload time stamp in Snowflake.
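
The state and feedability codes above can be decoded with a small helper. This is our own sketch of the mappings listed, not an object shipped with the Snowflake share:

```python
# Mappings taken from the SUBSCRIPTION view description above.
STATE_LABELS = {0: "active", 4: "discontinued", -1: "deleted"}

def describe_subscription_row(state: int, available_for_feed: int) -> str:
    state_label = STATE_LABELS.get(state, "unknown")
    feed_label = "feedable" if available_for_feed == 1 else "non-feedable"
    return f"{state_label}, {feed_label}"

print(describe_subscription_row(0, 1))  # active, feedable
print(describe_subscription_row(4, 0))  # discontinued, non-feedable
```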

LATEST_ENTITY [View]

Derived from the MAIN_ENTITY table, this view provides the latest version of your time series. It is the equivalent of the latest vintage available for time series storing Revision History.

Data Web API feed – How to update time series?

Introduction

This document outlines several methods for updating your universe of time series using Macrobond’s Python library. Please note that if you are using a language other than Python, or prefer to define your own functions, our Data Web API’s HTTPS endpoints support the same mechanisms as described below.

The Data Web API can be used as a data feed service and the methods below are designed to update the time series of interest in your processes or databases efficiently and quickly.

A ‘time series’ in this context consists of a unique identifier, a list of date/value pairs and a set of metadata. Any updates will always contain all this data as one atomic unit.

Updating a time series means retrieving the most up-to-date version of a time series, identified by its unique identifier (‘primname’), from the Macrobond database. An update can happen for several reasons: for example, a new data point being released, historical data points being revised by the source, or an update of the metadata.

You should always keep track of the time when each time series was last updated. This allows you to do conditional downloads and to determine if a series has changed. The timestamp to use for this purpose is part of the metadata of each series in the form of the attribute ‘LastModifiedTimeStamp’.

Methods #1 and #2 below are dedicated to users subscribing to Macrobond’s Pay-as-you-Go model via an annual allowance of Unique Time Series (UTS). Method #3 is dedicated to users subscribing to one of Macrobond’s data packages available here. Users from scenario #3 can also use methods #1 and #2, but not vice versa.

This document is for illustration purposes only. Please contact support@macrobond.com for any questions.

Universe Update (get_many_series)

You can call get_many_series to download the most up to date time series for new series or for updating existing series.

Input Parameters

The method accepts three parameters: primname, ifmodifiedsince (optional), and include_not_modified.

  • primname: The name or alias of the timeseries
  • ifmodifiedsince (optional):
    • The date and time of the LastModifiedTimeStamp from a previous request. It can be left blank if it is the first download. Recording the LastModifiedTimeStamp of a time series separately is recommended for the next updates.
    • The function will compare this parameter with the underlying timeseries’ parameter LastModifiedTimeStamp. If the input is earlier than the new LastModifiedTimeStamp, the time series will be fetched.
  • include_not_modified: Set to False by default. If set to True, the response will include information about all the time series, even if they have not been modified/updated since the previous request (in which case only name, error_message and status_code 304 will be returned).

Response

It returns a series entity with dates, values, and metadata. It also exposes further response fields, including is_error, status_code, and error_message, so you can always check the status of the response:

  • 200 = OK (All is well, full content returned)
  • 304 = NotModified (The item was not modified and is not included in the response)
  • 403 = Forbidden (Access to the item was denied)
  • 404 = NotFound (The item was not found)
  • 500 = Other (There was an error and it is described in the error text)

In addition to the properties, there are multiple helper methods for the returned entity object. For example, you can use to_dict() to return a dictionary or use values_to_pd_data_frame() to represent it as a Pandas DataFrame.
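
The conditional-download flow can be sketched as follows. To keep the example self-contained and runnable, the call to get_many_series is replaced with stub response objects carrying only the fields the logic needs; the series names and timestamps are illustrative:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SeriesResponse:
    # Stand-in for the entity returned by get_many_series; only the
    # fields needed by the update logic are modelled here.
    name: str
    status_code: int
    last_modified: Optional[str] = None  # LastModifiedTimeStamp from metadata

def apply_updates(responses: List[SeriesResponse], stored: dict) -> List[Tuple[str, int]]:
    """Record the new LastModifiedTimeStamp for series returned with full
    content (200), skip 304 (not modified), and collect the rest as errors."""
    errors = []
    for resp in responses:
        if resp.status_code == 200:
            stored[resp.name] = resp.last_modified
        elif resp.status_code != 304:
            errors.append((resp.name, resp.status_code))
    return errors

stored = {"uscpi": "2024-04-10T08:00:00"}  # timestamps kept from earlier downloads
errors = apply_updates(
    [
        SeriesResponse("usgdp", 200, "2024-05-01T08:00:00"),
        SeriesResponse("uscpi", 304),
        SeriesResponse("badname", 404),
    ],
    stored,
)
print(stored)  # usgdp recorded, uscpi unchanged
print(errors)  # [('badname', 404)]
```

In a real script, the stub list would be the result of get_many_series, and the stored timestamps would feed the ifmodifiedsince parameter of the next request.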

More Information

This method can be automated by users as part of their retrieval script, but Macrobond also recommends looking into the next method ‘SubscriptionList’ for further automation.
For more information, please check the get_many_series documentation.

For a detailed example, please see the notebook hosted on Macrobond’s GitHub repository: 1.3 - Macrobond Data API - Fetching multiple Time Series


Universe Update with Revision History (get_many_series_with_revisions)

You can call get_many_series_with_revisions to download and update your universe with revision history data (corresponding to a point-in-time representation of a time series, also known as a ‘vintage’ time series).

Input Parameters

The method accepts two parameters: Sequence[RevisionHistoryRequest] and include_not_modified.

  • RevisionHistoryRequest: A class that represents a series request and has the following fields:
    • name: The name or alias of the timeseries; the full result will be returned if the parameters below are left blank
    • ifmodifiedsince (optional):
      • The date and time of the LastModifiedTimeStamp from a previous request. It can be left blank if it is the first download. Recording the LastModifiedTimeStamp of a time series separately is recommended for the next updates.
      • The function will compare this parameter with the underlying timeseries’ parameter LastModifiedTimeStamp. If the input is earlier than the new LastModifiedTimeStamp, the time series will be fetched.
    • last_revision (optional):
      • The date and time of the LastRevisionTimeStamp from a previous request.
      • If specified, the function will compare this parameter with the underlying timeseries’ metadata LastRevisionTimeStamp. If the input is earlier than the latest LastRevisionTimeStamp, the function will return the vintages whose RevisionTimeStamp is later than the last_revision input. PartialContent (206) will be returned in that case.
      • last_revision can only be specified when the ifmodifiedsince parameter is included.
      • If LastRevisionAdjustmentTimeStamp exists in the underlying time series, then last_revision can only be specified along with the last_revision_adjustment parameter. Otherwise, full content (200) will be returned.
    • last_revision_adjustment (optional):
      • The date and time of the LastRevisionAdjustmentTimeStamp from a previous request (if any).
      • If specified, the function will compare this with LastRevisionAdjustmentTimeStamp. If the input is earlier, then full content (200) will be returned to sync the modification of the whole revision history. Otherwise, the result depends on the input of last_revision.
    • include_not_modified: Set to False by default. If set to True, the response will include information about all the time series, even if they have not been modified/updated since the previous request (in which case only name, error_message and status_code 304 will be returned).
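
The interplay of the optional parameters can be encoded in a small helper. This is a sketch of the rules above using a plain dictionary of previously recorded timestamps; the dictionary keys and the function name are our own, not part of the library:

```python
def revision_request_params(stored: dict) -> dict:
    """Build the optional parameters for a revision-history request
    from timestamps recorded after the previous download."""
    params = {}
    if "last_modified" in stored:
        params["ifmodifiedsince"] = stored["last_modified"]
        # last_revision can only be specified together with ifmodifiedsince.
        if "last_revision" in stored:
            params["last_revision"] = stored["last_revision"]
            # When a LastRevisionAdjustmentTimeStamp was previously returned,
            # it must be sent along with last_revision.
            if "last_revision_adjustment" in stored:
                params["last_revision_adjustment"] = stored["last_revision_adjustment"]
    return params

# First download: no stored timestamps, so request the full history.
print(revision_request_params({}))  # {}
# Subsequent download: send everything recorded from the last response.
print(revision_request_params({
    "last_modified": "2024-05-01T08:00:00",
    "last_revision": "2024-05-01T07:55:00",
}))
```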

Response

The result is an object of class SeriesWithVintages, whose fields are:

  • vintages: A list of objects of class VintageValues, whose fields are:
    • dates
    • values
    • vintage_time_stamp
  • status_code:
    • 200 = OK (All is well, full content returned)
    • 206 = PartialContent (The operation was successful, but only new revisions are included)
    • 304 = NotModified (The item was not modified and is not included in the response)
    • 403 = Forbidden (Access to the item was denied)
    • 404 = NotFound (The item was not found)
    • 500 = Other (There was an error and it is described in the error text)
  • error_text: The error text, if an error occurred.
  • last_modified: The timestamp of the last modification, to be used in the next call.
  • last_revision: The timestamp of the last revision, to be used in the next call.
  • last_revision_adjustment: The timestamp of the last revision adjustment, to be used in the next call.
  • metadata: The metadata of the underlying time series
  • primary_name: The PrimName of the time series.

More Information

For more information, please check the get_many_series_with_revisions documentation.

For a detailed example, please see the notebook hosted on Macrobond’s GitHub repository: 4.1 - Macrobond Data API - Revision History.ipynb

Polling for updates (SubscriptionList)

SubscriptionList allows you to define your universe of time series and efficiently poll for updates with minimal latency. In practice, you use this in an infinite loop by calling the ‘poll’ method and passing the timeStampForIfModifiedSince returned by the last poll. This method allows you to create a master list containing as many time series as subscribed to via the annual data allowance plan in terms of the number of Unique Time Series (UTS).

When you add a new time series to the list, you should download the series first as described in section 1 or 2, add it to your database and only then include it in the subscription list.

Methods

  • set: Declare which series to include in the subscription list
  • add: Add one or more series to the subscription list
  • remove: Remove one or more series from the subscription list
  • poll: Polls for any changes on the series in the subscription list. The API endpoint uses subscriptionlist/getupdates, which returns three variables:
    • timeStampForIfModifiedSince: Timestamp for next poll
    • noMoreChanges: If true, the poll loop will rest for 15 seconds (default)
    • entities: A list of entity names and timestamps when last modified

Input Parameters

  • last_modified: The timestamp of when the subscription list was last modified.

Response

If there are any updates, the poll will return a dictionary of primary keys that have been updated, and the corresponding timestamp of the last update. If there are no updates, the method will return an empty dictionary after the poll interval time. This gives an opportunity to abort the polling loop.

You should compare the time stamp returned with the time stamp you have stored in your database to determine if the series is actually updated compared to the version that is stored. In rare cases, the polling may include some series that have not been updated.
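
That comparison can be sketched as a simple filter over the poll result. The timestamps below are illustrative ISO-8601 strings, which compare correctly as text, and the function name is our own:

```python
def series_needing_refresh(poll_result: dict, stored: dict) -> list:
    """Keep only series whose polled timestamp is newer than the one we
    have stored, since a poll may occasionally include unchanged series."""
    return sorted(
        name
        for name, ts in poll_result.items()
        if name not in stored or ts > stored[name]
    )

polled = {"usgdp": "2024-05-01T08:00:00", "uscpi": "2024-04-10T08:00:00"}
stored = {"usgdp": "2024-04-28T08:00:00", "uscpi": "2024-04-10T08:00:00"}
print(series_needing_refresh(polled, stored))  # ['usgdp']
```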

More Information

For more information, please check the subscription_list documentation.
For a detailed example, please see the Python script hosted on Macrobond’s GitHub repository: subscribing_to_updates.py

  • Please note that for substantially larger universes, the polling interval can be longer than 15 seconds, e.g. 30 seconds, 60 seconds, etc.

Poll for and Retrieve updates (GetDataPackageList)

This method is designed for Data Web API customers subscribing to data packages only (see https://www.macrobond.com/data-feed-packages). It will not work for subscribers of the Pay-as-you-Go model. Users can synchronize the updates of time series in their subscribed package to their environment in an automated and efficient way.

Methods

get_data_package_list is available for small packages including the Global Key Indicators package.

get_data_package_list_chunked should be used for large packages. This will process the data package list in chunks. This is more efficient since the complete list does not have to be in memory and it can be processed while downloading.

Both methods are calling the same API endpoint: series/getdatapackagelist.

Input parameters

  • if_modified_since: Specify a timestamp to see what has changed since then. To get all items returned, leave the parameter as None. This parameter should only be the last returned timeStampForIfModifiedSince; it will not work with any other arbitrary time stamp.
  • chunk_size (for method get_data_package_list_chunked): The maximum number of items to include in each List in DataPackageListContext.items.

Response

  • When using the method get_data_package_list, an object of class DataPackageList is returned, containing a list of DataPackageListItem objects with the entity name and the timestamp when the entity was last modified.
  • When using the method get_data_package_list_chunked, an object of class DataPackageListContextManager is returned. Make sure to use it in a “with … as” statement.
  • Both methods’ response includes:
    • timeStampForIfModifiedSince: A timestamp to pass as the ifModifiedSince parameter in the next request to get incremental updates
    • downloadFullListOnOrAfter: Earliest recommended next poll for a full list, by omitting timeStampForIfModifiedSince.
    • state:
      • 0 = FullListing (A complete listing of all series. Make another request for full data at some point after timestamp in downloadFullListOnOrAfter)
      • 1 = UpToDate (The list contains all updates since the specified start date. Wait 15 minutes before making another request where timeStampForIfModifiedSince is used)
      • 2 = Incomplete (The list might not contain all updates. Wait one minute and then use the timeStampForIfModifiedSince in a new request)
    • entities: The main result, which is the list of entity names and timestamps when the entities were last modified.
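
The three state codes imply different follow-up actions. This helper (our own naming) simply encodes the recommendations listed above:

```python
def next_poll_action(state: int) -> str:
    """Map the state code of a getdatapackagelist response to the
    recommended follow-up."""
    if state == 0:
        return "full listing: request a full list again after downloadFullListOnOrAfter"
    if state == 1:
        return "up to date: wait 15 minutes, then poll with timeStampForIfModifiedSince"
    if state == 2:
        return "incomplete: wait one minute, then poll with timeStampForIfModifiedSince"
    raise ValueError(f"unknown state: {state}")

print(next_poll_action(2))
```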

More Information

For more information, please check the get_data_package_list and get_data_package_list_chunked documentation.

Additional Method (current, previous, and differential csv files)

Customers subscribing to a data package can also request their ‘unique_id’ from their Macrobond account team. Once provided, customers can retrieve three .csv files exposing the current, previous, and differential universes through the URLs below:

https://datapackagelist.api.macrobondfinancial.com/{unique_id}_current.csv
https://datapackagelist.api.macrobondfinancial.com/{unique_id}_previous.csv
https://datapackagelist.api.macrobondfinancial.com/{unique_id}_diff.csv

In the current and previous files, customers will find the columns ‘name’ and ‘state’. In the diff file, customers will find ‘previous_name’, ‘previous_state’, ‘current_name’, and ‘current_state’.

When previous_name and previous_state are missing, the series has just appeared on the whitelist.

When current_name and current_state are missing, the series is no longer available.
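
The diff-file semantics can be sketched with Python’s csv module. The sample rows and series names below are made up for illustration:

```python
import csv
import io

def classify_diff_row(row: dict) -> str:
    """Interpret one row of the {unique_id}_diff.csv file."""
    if not row.get("previous_name") and not row.get("previous_state"):
        return "appeared"  # series just appeared on the whitelist
    if not row.get("current_name") and not row.get("current_state"):
        return "removed"   # series no longer available
    return "changed"

sample = io.StringIO(
    "previous_name,previous_state,current_name,current_state\n"
    ",,newseries,0\n"
    "oldseries,4,,\n"
)
print([classify_diff_row(row) for row in csv.DictReader(sample)])
# ['appeared', 'removed']
```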

Note: on the first run, only the current file will be available. The previous and diff files will be available from the second execution onwards.

The Macrobond FactSet Data Provider

Introduction

With the Macrobond FactSet Data Provider you can retrieve time series from FactSet if you have an account there.

Requirements

You must have a FactSet account that is enabled for the FactSet Formula API.

On the Macrobond side, you need a Core or Data+ license, or Macrobond Viewer.

Enabling the FactSet Data Provider

To use the FactSet Data Provider, go to Configuration > Settings > tab 'My series' (for Macrobond versions before 1.28: Edit > Settings > tab 'My series'). Select Add > FactSet > Request/renew access.

A new window from FactSet will open in your browser; type in your login and password:

After that, you will be asked to allow the Macrobond application to access your FactSet account; press 'Allow':

Note that there is a time limit for this operation. You will be notified with this message when the time runs out:

Just press 'OK' and start again.

After allowing access, you will see our 'FactSet authorization successful' page and a confirmation in the Macrobond application:

Working with FactSet Data Provider

Series names

In the application's Series and Expressions, you can use the form:

"ih:fds:id:formula"

where id and formula are the FactSet ID and formula. For example:

"ih:fds:aapl-usa:p_price(0,-10ay)"
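
For scripted workflows, the series-name form can be assembled with a trivial helper. The function is our own convenience sketch, not part of the provider:

```python
def factset_series_name(factset_id: str, formula: str) -> str:
    """Build the "ih:fds:id:formula" name used in Series and Expressions."""
    return f"ih:fds:{factset_id}:{formula}"

print(factset_series_name("aapl-usa", "p_price(0,-10ay)"))
# ih:fds:aapl-usa:p_price(0,-10ay)
```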

FactSet Data Provider Enhancement

To help you create the series expressions, you can paste a FactSet "=FDS" expression from Excel or a URL from the Formula API Request Builder in the Macrobond application data browser.

For example, you can use:

=FDS("F-US","FG_PRICE(0)")

FactSet OFDB

It is possible to get data from an OFDB into Macrobond, provided the database is of type 3D (Timeseries/Iterated).

This is done by using an OFDB formula like this:

"ih:fds:aapl-us:ofdb(\"personal:/ofdb_test\",\"data\",0,-1am)"

which is the equivalent of formulas=OFDB.

Internally Macrobond will use the formula ofdb_cal_dates with the same OFDB parameters to get the calendar.

Revoking access

You can revoke access for the Macrobond application on the FactSet granted open-id services page.

Or you can do it from the Macrobond application. Go to Configuration (in the upper menu) > Settings > tab 'My series' (for Macrobond versions before 1.28: Edit > Settings > tab 'My series'). Select FactSet > Revoke access.

The Macrobond Data API for Python (Python wrapper)

Introduction

The Macrobond Data API for Python uses either the Macrobond Web REST API or the Macrobond Client data API to obtain time series with values and metadata. The API consists of a set of functions in common between the underlying APIs as well as specialized functions unique to each implementation.

You get time series directly as Pandas series.

Requirements

This API can be used only with a Data+ license.

Macrobond Data API for Python officially supports Python 3.6+.

Installation

Macrobond Data API for Python is available on PyPI:

python -m pip install macrobond-data-api

Keyring

When using WebClient, it is recommended to use the system keyring. A keyring is only necessary if your license includes the Web API; if you have only a Data+ license (and therefore use the COM API), it is not needed.

This can be done easily by running the included script using this command:

python -c "from macrobond_data_api.util import *; save_credentials_to_keyring()"

Supported keyrings:

Error: failed testing username and password

raise self.oauth_error_class(
authlib.integrations.base_client.errors.OAuthError: invalid_client:


This error usually stems from one of two situations:

  1. You do not have the proper Macrobond license to use it. The keyring is only necessary if you have a license which includes the Web API.
    With a Python, R, EViews or MATLAB API license you must use the COM API instead; you do not need to enter any passwords there, as they are fetched from the Macrobond application.
  2. The credentials are being mistyped. When using the Windows Command Prompt to set up the keyring, it might be confusing that nothing shows up as you type the password; this is a normal security mechanism. Also, by default, you cannot copy and paste into the password field; it must be typed manually.
    • To be able to paste the password in the Command Prompt, you must enable it in the Command Prompt properties. To do that, open Command Prompt, right-click on the title bar and select Properties, then enable 'Use Ctrl+Shift+C/V as Copy/Paste'.

Working with Python wrapper

Learn more about the Commonly used metadata in Macrobond.

Download one series (Basic usage)

import macrobond_data_api as mb_api

usgdp = mb_api.get_one_series("usgdp")

print(usgdp)

Download one series (Advanced usage)

#web
from macrobond_data_api.web import WebClient

with WebClient('client id', 'client secret') as api:
    series = api.get_one_series('usgdp')

print(series)
#com
from macrobond_data_api.com import ComClient

with ComClient() as api:
    series = api.get_one_series('usgdp')

print(series)

Extensive guide - more examples

We have prepared examples in Jupyter Notebooks to help you start using the Python wrapper. See them on our GitHub.

Web API Series Provider demo Virtual Machine

Introduction

You can test our Web API Series Provider functionality with a view of a test database.

Requirements

You will need an XML configuration file, which you will receive from your Account Manager.

How to connect?

  1. Download the xml file(s) you have received.
  2. Go to Configuration > Settings > My series (for MB pre-1.28: Edit > Settings > My series); a new window will appear
    1. select 'Add' > 'Web API Series Provider'
    2. enter the information as in the screenshot:
      Name: seriesprovider
      Prefix: sp
      Configuration file path: navigate to the xml file(s) you have received
    3. Press 'OK'

Accessing data

You can access this data through Browse or Analytics. Go to the drop-down menu and select the newly added database.

For more information see The Macrobond client Web API Series Provider.

SQL In-house – connection strings

Introduction

The Macrobond SQL Database connector configuration file includes a section specifying the data provider name and the database connection string. These are unique to each database type. This page includes examples for the most commonly used databases.

MySQL and MariaDB

Please install the MySQL Connector/NET from https://dev.mysql.com/downloads/connector/net/ and use:

<sql:DataProvider>
 <sql:Name>MySql.Data.MySqlClient</sql:Name>
 <sql:ConnectionString>
  Server=address.of.the.server;
  Database=dbname;
  Uid=user;Pwd=pass;
 </sql:ConnectionString>
 <!-- instead of Uid=user;Pwd=pass; you can use Uid={0};Pwd={1}; and let Macrobond
  pass the credentials entered by the user in the in-house configuration dialog
  to the connection string -->
</sql:DataProvider>

MS SQL Server

There is no need to install a .NET data provider for MS SQL Server.

<sql:DataProvider>
 <sql:Name>System.Data.SqlClient</sql:Name>
 <sql:ConnectionString>
  server=szcda41.m.domain.eu;
  database=nameofdb;
  User Id={0};Password={1}
 </sql:ConnectionString>
 <!-- login can be in form domain.name\some.login -->
 <!-- to use windows credentials replace Id=..;Password=.. with: security=SSPI -->
</sql:DataProvider>

PostgreSQL

Please install the npgsql driver from the 4.0.x branch (4.0.7 is known to work) and enable GAC installation. GAC installation is needed for the Macrobond application; newer versions of npgsql do not provide this option.

<sql:DataProvider>
 <sql:Name>Npgsql</sql:Name>
 <sql:ConnectionString>
  Database=nameofdb; Host=database.server.address; Port=5432;
  Username={0};Password={1};
 </sql:ConnectionString>
</sql:DataProvider>

Other databases

Please also see the examples listed at https://www.dofactory.com/connection-strings

Note that MS Access is not supported.

SQL demo Virtual Machine – instructions for managers

What do you need to send to the client?

  1. A zipped file containing 3 xml configuration files, one for each database view.
  2. The login and password to the database, shared via a secret link, which you can generate at https://secret.macrobond.com/new

IMPORTANT

The client needs to install the .NET driver below for MySQL/MariaDB; it will allow the Macrobond main application to 'talk' to those databases:

https://dev.mysql.com/get/Downloads/Connector-Net/mysql-connector-net-8.0.28.msi

How client can connect to our database?

! This is the link that you want to send to the client !

Here you will find instructions on how to connect: https://help.macrobond.com/technical-information/the-macrobond-sql-database-connector/sql-demo/#connect

SQL demo Virtual Machine

Introduction

You can test our SQL database connector functionality with three views of a test SQL database.

Requirements

Our example uses MySQL and requires the .NET connector for MySQL. This allows Macrobond to 'talk' to the SQL database.

You will also need an XML configuration file, which you will receive from your Account Manager. We have prepared three different views of the database, so there are three different xml files.

How to connect?

With these steps you will connect to our live SQL database.

  1. Download and install .NET connector from here: https://dev.mysql.com/get/Downloads/Connector-Net/mysql-connector-net-8.0.28.msi
  2. Download xml file(s) you have received.
  3. Go to Configuration > Settings > My series (for MB pre-1.28: Edit > Settings > My series); a new window will appear
    1. select 'Add' > 'SQL database'
    2. enter the information as in the screenshot:
      Name: database1
      Prefix: dba
      Configuration file path: navigate to one of the three xml files you have received
      Username & password: paste in the ones you have received
      Note that each xml file contains the path to a different view of the database. To connect your Macrobond to all three, use 'Add' > 'SQL database' for each of them.
    3. Press 'OK'.

Accessing data

You can access this data through Browse or Analytics. Go to the drop-down menu and select the newly added database.

Examples show both Static and Dynamic trees. The main difference between the two is that Dynamic uses specific columns or rows in your database to automatically generate nodes in the tree.

For more information see The Macrobond SQL Database Connector and SQL In-house – examples.

The Macrobond FTP/SFTP data feed

Introduction

The Macrobond FTP/SFTP data feed allows automated integration of our economic and financial data with your in-house backend systems. Below we describe basic feed concepts and the various types of feed service we offer.

The Macrobond FTP/SFTP data feed is a separately licensed solution.

Working with FTP data feed

Data Delivery

For maximum flexibility, updates are provided in XML format. This allows for easy integration with your data processing system(s). Updates are provided on Macrobond servers and are reachable via FTP, SFTP or HTTPS protocols. Update documents are generated every time new data is integrated into the Macrobond database, or when existing data is revised.

You will receive a dump of the full history once a week, covering the entire set of subscriptions.

You will receive a list of all the subscribed series with information about their status every 24 hours.

Data Set Selection

Depending on your requirements, you can declare the names of the time series you wish to retrieve in Macrobond's data feed online portal.

Content

Content of the Data Feed Folder

All feed files are stored in the root folder of your FTP account.

Update documents

Update documents are generated every time new data is integrated into the Macrobond database and are named using the following standard: {number}.xml e.g.:

-rw-r--r-- 1 1000 1000 29488 Dec 16 11:06 1.xml
-rw-r--r-- 1 1000 1000 5145 Dec 20 17:01 10.xml
-rw-r--r-- 1 1000 1000 29508 Dec 21 11:04 11.xml
-rw-r--r-- 1 1000 1000 5159 Dec 21 17:00 12.xml
-rw-r--r-- 1 1000 1000 29514 Dec 22 11:05 13.xml
-rw-r--r-- 1 1000 1000 5173 Dec 22 13:58 14.xml
-rw-r--r-- 1 1000 1000 5173 Dec 22 17:00 15.xml
-rw-r--r-- 1 1000 1000 29521 Dec 23 11:04 16.xml
-rw-r--r-- 1 1000 1000 5187 Dec 23 17:00 17.xml
-rw-r--r-- 1 1000 1000 5201 Dec 24 08:01 18.xml
-rw-r--r-- 1 1000 1000 5201 Dec 24 17:00 19.xml
-rw-r--r-- 1 1000 1000 5118 Dec 16 17:00 2.xml
-rw-r--r-- 1 1000 1000 29529 Dec 27 11:07 20.xml
-rw-r--r-- 1 1000 1000 5215 Dec 27 17:00 21.xml
-rw-r--r-- 1 1000 1000 29536 Dec 28 11:05 22.xml
-rw-r--r-- 1 1000 1000 5229 Dec 28 17:01 23.xml
-rw-r--r-- 1 1000 1000 29543 Dec 29 11:06 24.xml
-rw-r--r-- 1 1000 1000 29495 Dec 17 11:07 3.xml
-rw-r--r-- 1 1000 1000 5131 Dec 17 17:01 4.xml
-rw-r--r-- 1 1000 1000 5145 Dec 20 10:32 5.xml
-rw-r--r-- 1 1000 1000 5145 Dec 20 10:47 6.xml
-rw-r--r-- 1 1000 1000 29502 Dec 20 11:07 7.xml
-rw-r--r-- 1 1000 1000 5145 Dec 20 11:21 8.xml
-rw-r--r-- 1 1000 1000 5145 Dec 20 17:00 9.xml

Data contained in these files should be integrated into your database systems following the natural numeric order of {number}: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 etc. (note that an alphabetical listing, as above, does not reflect this order).
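As a minimal sketch, the update documents can be put into their natural order in Python before processing (the filenames below are taken from the listing above):

```python
# Sort update documents into natural {number}.xml order before integrating
# them, since an alphabetical directory listing interleaves 1, 10, 11, 2, ...
def update_order(filenames):
    xml_files = [f for f in filenames if f.endswith(".xml")]
    return sorted(xml_files, key=lambda name: int(name.split(".")[0]))

listing = ["1.xml", "10.xml", "11.xml", "2.xml", "3.xml"]
print(update_order(listing))  # ['1.xml', '2.xml', '3.xml', '10.xml', '11.xml']
```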

Full data dump

A full dump of documents is generated once a week using the following naming convention: full_dump_{date}_{time}s_n_of_m.xml.gz. For instance:

-rw-r--r-- 1 feed feed 14564466 Dec 2 03:10 full_dump_20211202_031014s_1_of_8.xml.gz
-rw-r--r-- 1 feed feed 13835432 Dec 2 03:10 full_dump_20211202_031014s_2_of_8.xml.gz
-rw-r--r-- 1 feed feed 13669402 Dec 2 03:10 full_dump_20211202_031014s_3_of_8.xml.gz
-rw-r--r-- 1 feed feed 14003385 Dec 2 03:10 full_dump_20211202_031014s_4_of_8.xml.gz
-rw-r--r-- 1 feed feed 13446979 Dec 2 03:10 full_dump_20211202_031014s_5_of_8.xml.gz
-rw-r--r-- 1 feed feed 14194428 Dec 2 03:10 full_dump_20211202_031014s_6_of_8.xml.gz
-rw-r--r-- 1 feed feed 14071453 Dec 2 03:10 full_dump_20211202_031014s_7_of_8.xml.gz
-rw-r--r-- 1 feed feed 14222872 Dec 2 03:10 full_dump_20211202_031014s_8_of_8.xml.gz

Currently, as above, 8 files form a full dump, but this number may change in the future.

Files that are more than 6 days old are automatically removed from the client’s data feed folder.

Content of the Data Feed Files

Update documents are generated whenever series to which feed clients subscribe are revised or extended.

Full data dumps, which are generated weekly even when the database has not been updated, use the same xml format as the update documents.

Format

The XSD file defining the xml format produced by the feed can be found at: http://www.macrobond.com/schemas/Feed.xsd

When an xml file contains series, they will be presented in the following format:

<result>
  <feed_entry document_type="full" primname="some_code0">
    <entity primname="some_code0" type="TimeSeries" state="0" changed_observations="1" changed_meta="0">
      <meta_data>
        <!-- meta-data part -->
      </meta_data>
      <observation_data>
        <!-- series observation part -->
      </observation_data>
    </entity>
  </feed_entry>
  <feed_entry document_type="full" primname="some_code1">
    <entity primname="some_code1" type="TimeSeries" state="0" changed_observations="1" changed_meta="1">
      <meta_data>
        <!-- meta-data part -->
      </meta_data>
      <observation_data>
        <!-- series observation part -->
      </observation_data>
    </entity>
  </feed_entry>
</result>

Each time series provided in the xml document is enclosed in a feed_entry tag and is identified by the primname attribute.

Each time series document, enclosed in an entity tag, contains two sections:

  • meta_data – a metadata section containing attributes describing the time series, including information about the start date, frequency and day mask.
  • observation_data – containing the values of the time series, one observation tag per date.
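The two sections can be read with Python's standard xml.etree module. The following is a sketch: the sample entry is hypothetical but follows the structure shown above.

```python
import xml.etree.ElementTree as ET

# Parse one feed_entry: collect metadata attributes by name, plus the
# (date, value) observation pairs.
doc = """<result>
  <feed_entry document_type="full" primname="some_code0">
    <entity primname="some_code0" type="TimeSeries" state="0">
      <meta_data>
        <attr dt="string" attr_id="4" name="Frequency">Daily</attr>
        <attr dt="datetime" attr_id="9" name="StartDate" freetext="1">1971-01-04</attr>
      </meta_data>
      <observation_data>
        <observation date="1971-01-04">5.1643</observation>
        <observation date="1971-01-05">5.1628</observation>
      </observation_data>
    </entity>
  </feed_entry>
</result>"""

for entry in ET.fromstring(doc).iter("feed_entry"):
    entity = entry.find("entity")
    meta = {attr.get("name"): attr.text for attr in entity.find("meta_data")}
    observations = [(o.get("date"), float(o.text))
                    for o in entity.find("observation_data")]
    print(entry.get("primname"), meta["Frequency"], observations)
```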

Series Metadata

  • Start date information is presented as: <attr dt="datetime" attr_id="9" name="StartDate" freetext="1">1971-01-04</attr>

  • Frequency information is presented as: <attr dt="string" attr_id="4" name="Frequency" short_name="daily" value_id="4482">Daily</attr>

Possible values of the frequency field:

  • Annual
  • BiMonthly
  • Daily
  • Monthly
  • QuadMonthly
  • Quarterly
  • SemiAnnual
  • Weekly

Day Mask

Daily time series also contain day mask information indicating which days of the week are described by the given set of observations. The day mask is expressed as the decimal representation of a binary number, with the least significant bit corresponding to Monday. The Monday-to-Friday working week is described as 31 [binary: 0001 1111].

Sample Day mask information: <attr dt="int32" attr_id="3" name="DayMask" freetext="1">31</attr>
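For illustration, the bitmask can be decoded like this (a minimal sketch; the function name is ours):

```python
# Decode the DayMask bitmask: bit 0 (least significant) is Monday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def decode_day_mask(mask):
    return [day for bit, day in enumerate(WEEKDAYS) if mask & (1 << bit)]

print(decode_day_mask(31))   # the Monday-to-Friday working week
print(decode_day_mask(127))  # all seven days
```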

The free-text attribute Scale gives the order of magnitude of the values provided in the series observations section. For instance:

<attr freetext="1" name="Scale" attr_id="8" dt="int32">0</attr>: indicates that all values provided in the series observations should be multiplied by 10^0 = 1.

<attr freetext="1" name="Scale" attr_id="8" dt="int32">-3</attr>: observations should be multiplied by 10^-3 = 0.001.

Please note that some of the attributes might have more than one value:

<attr dt="string" attr_id="66" name="Region" short_name="de" value_id="337">Germany</attr>

<attr dt="string" attr_id="66" name="Region" short_name="state_de_badwur" value_id="580833">Baden-Württemberg</attr>

Series Observations

Series observations are expressed as repeated xml tags containing date and observation value pairs. For instance:

<observation_data>
<observation date="1971-01-04">5.1643</observation>
<observation date="1971-01-05">5.1628</observation>
<observation date="1971-01-06">5.1614</observation>
<observation date="1971-01-07">5.1649</observation>
<observation date="1971-01-08">5.1631</observation>
</observation_data>

The Observations section contains only values for dates valid based on the StartDate/Frequency/DayMask combination.

Missing observations

It is possible that for some dates Macrobond will be missing observations. In this case the value of the observation tag will contain:

  • The string NaN – if the value is missing
  • The letter s – if it is known that there is no data for the given date (because of a holiday)

For instance:

<entity primname="sek" type="TimeSeries" state="0" changed_observations="1" changed_meta="0">
  <observation_data>
    <observation date="1971-01-04">5.1643</observation>
    <observation date="1971-01-05">s</observation> <!-- there was no data/trading available for this date -->
    <observation date="1971-01-06">5.1614</observation>
    <observation date="1971-01-07">NaN</observation> <!-- macrobond has missed data update for that date -->
    <observation date="1971-01-08">NaN</observation> <!-- macrobond has missed data update for that date -->
    <observation date="1971-01-11">5.1642</observation>
  </observation_data>
  <meta_data>
    <attr dt="string" attr_id="4" name="Frequency" short_name="daily" value_id="4482">Daily</attr>
    <attr dt="datetime" attr_id="9" name="StartDate" freetext="1">1971-01-04</attr>
    <!-- other attributes -->
  </meta_data>
</entity>
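When integrating the feed, these sentinel strings need special handling. A minimal sketch of the mapping (the helper name is ours):

```python
import math

# Map feed observation strings to Python values: "s" marks a date with no
# data (e.g. a holiday), "NaN" a value Macrobond is missing.
def parse_observation(text):
    if text == "s":
        return None        # known to have no data for this date
    if text == "NaN":
        return math.nan    # missing observation
    return float(text)

raw = ["5.1643", "s", "5.1614", "NaN"]
print([parse_observation(v) for v in raw])
```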

Replaced Series – Aliases

It is possible that some of the series you are subscribed to will be removed from the Macrobond database and replaced with another. The replacement series contains a comma-separated list of the substituted series codes in an attribute called Aliases.

For instance – when the series tovvg47swdaea19tu was removed from the Macrobond database, clients who subscribed to it automatically received an update for a new series called tovvg47scaea19tu.

The new series contained:

<entity primname="tovvg47scaea19tu" type="TimeSeries" state="0" changed_observations="1" changed_meta="0">
  <observation_data>
    …
  </observation_data>
  <meta_data>
    <attr dt="string" attr_id="5" name="Aliases" freetext="1">tovvg47swdaea19tu</attr>
    …
  </meta_data>
</entity>
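A small sketch of using the Aliases attribute to keep an in-house mapping from removed codes to their replacement series (the function name is ours):

```python
# Build a lookup from replaced (alias) codes to the replacement series code,
# from the comma-separated Aliases attribute.
def alias_map(primname, aliases_attr):
    return {old: primname for old in aliases_attr.split(",") if old}

print(alias_map("tovvg47scaea19tu", "tovvg47swdaea19tu"))
# {'tovvg47swdaea19tu': 'tovvg47scaea19tu'}
```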

Generated files

Subscription List files

Subscription List files are generated every 24 hours under the name subscription_list_{date}_{time}.csv. The same information can also be fetched via the web management interface at https://feed.macrobond.com/customer/ > Subscriptions List > Download.

Files are in CSV format, with the following columns:

  • subscribed_code - series code as subscribed,
  • actual_code - actual series code in the Macrobond database, in case subscribed_code is an alias,
  • state - current state of the series in the Macrobond database, with the following values:
    • 0 - series is active and receives updates,
    • 1 - series has been disabled and is not available via the data feed or the Macrobond application,
    • 4 - series has been discontinued by the source and is unlikely to receive further updates,
    • -1 - series has been deleted; if actual_code is not empty, updates will be provided under that code,
  • available_for_feed:
    • 0 - Macrobond is not allowed to re-distribute this series via the feed,
    • 1 - series is available for the feed,
  • last_update_time - date and time when the given series was last dumped, empty if it has not been dumped yet,
  • last_dump_file_num - number of the file where this series was most recently dumped, 0 if it has not been dumped yet.
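As an illustration, the file can be read with Python's csv module to flag series that are no longer active. This is a sketch: the sample rows (and the exact spelling of the header) are hypothetical, following the column list above.

```python
import csv
import io

# Read a Subscription List file and collect series whose state is not 0
# (i.e. not active). The sample rows below are hypothetical.
sample = """subscribed_code,actual_code,state,available_for_feed,last_update_time,last_dump_file_num
usgdp,,0,1,2021-12-02 03:10,24
oldcode,newcode,-1,1,,0
"""

inactive = [
    row["subscribed_code"]
    for row in csv.DictReader(io.StringIO(sample))
    if row["state"] != "0"
]
print(inactive)  # ['oldcode']
```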

Feed Statistics files

Feed Statistics files are generated every 24 hours under the name feed_statistics_{date}_{time}.csv. They contain a column header and a single line with the results. Columns:

  • max_allowed_subscription_size – the current upper limit on the size of the subscription list,
  • currently_subscription_size – the total number of series, including those not available, currently in the subscription list