Adding A New Data Source

User-defined internet data sources can be used seamlessly by TFire: define a type that subtypes InternetDataSource and implement the required functions for it.

Example

This guide shows how a user would add the (already included) YahooFinance data source this way, under the name YahooFinanceUser.

Create A New File

Create a new file YahooFinanceUser.jl in the CustomDataSources directory. In that file, add the following code:

using Dates, HTTP, JSON

struct YahooFinanceUser <: InternetDataSource end

function available_data_fields(::Type{YahooFinanceUser})
    return DataFields(Set([:close, :adj_close, :open, :low, :high, :volume]))
end

function external_API_timeseries(::Type{YahooFinanceUser}, ticker, from, to, requested_data_fields::DataFields; resolution="1d", skip_missing=false)
    # Note: intraday data cannot extend past the last 60 days (not checked here)
    # Reject resolutions that Yahoo does not support
    if !(resolution in ["1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h", "1d", "5d", "1wk", "1mo", "3mo"])
        e = "Yahoo Interval incorrectly specified"
        return ErrorResult(e)
    end
    # Reject requests for fields this source does not provide
    if !issubset(requested_data_fields.fields, available_data_fields(YahooFinanceUser).fields)
        e = "Requested data fields not available from Yahoo Finance"
        return ErrorResult(e)
    end
    if resolution == "1d"
        to = to + Day(1)
    end
    start = round(Int64, datetime2unix(DateTime(from)))
    stop = round(Int64, datetime2unix(DateTime(to)))
    params = Dict("period1" => start, "period2" => stop, "interval" => resolution)
    url = "https://query1.finance.yahoo.com/v8/finance/chart/" * ticker

    local r
    local data
    try
        r = HTTP.get(url; query=params, retries=5)
        data = JSON.parse(String(r.body))
    catch e
        return ErrorResult(e)
    end

    # A response without timestamps carries no usable data
    if !(haskey(data["chart"]["result"][1], "timestamp"))
        return ErrorResult("No timestamps in Yahoo Finance response")
    end
    timestamps = unix2datetime.(data["chart"]["result"][1]["timestamp"])
    quote_data = data["chart"]["result"][1]["indicators"]["quote"][1]

    result_data = Dict{Symbol,Vector}()
    result_data[:volume] = quote_data["volume"]
    result_data[:open] = quote_data["open"]
    result_data[:close] = quote_data["close"]
    result_data[:low] = quote_data["low"]
    result_data[:high] = quote_data["high"]

    if resolution in ["1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h"]
        timestamps = timestamps .+ Hour(2)
        result_data[:adj_close] = quote_data["close"]
        result_data[:adj_open] = quote_data["open"]
        result_data[:adj_low] = quote_data["low"]
        result_data[:adj_high] = quote_data["high"]
    else
        timestamps = DateTime.(Date.(timestamps))
        result_data[:adj_close] = data["chart"]["result"][1]["indicators"]["adjclose"][1]["adjclose"]
    end
    result_data[:Index] = timestamps
    return prepare_result(result_data, requested_data_fields, skip_missing)
end
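
The date handling in the function above can be tried in isolation. The following is a minimal sketch that uses only the Dates standard library; the request window is a hypothetical example, and the stated reason for the one-day shift is an assumption rather than something the source confirms.

```julia
using Dates

# Hypothetical request window; any Date or DateTime values would work.
from = Date(2024, 1, 2)
to = Date(2024, 1, 5)

# For daily bars the function pushes the end bound forward by one day,
# presumably so that the bar for `to` itself is returned (assumption).
to_inclusive = to + Day(1)

# Convert to whole Unix seconds, as done for "period1" and "period2".
start = round(Int64, datetime2unix(DateTime(from)))
stop = round(Int64, datetime2unix(DateTime(to_inclusive)))

# Converting back recovers the bounds (interpreted as UTC).
@assert unix2datetime(start) == DateTime(2024, 1, 2)
@assert unix2datetime(stop) == DateTime(2024, 1, 6)

# Daily timestamps are truncated to midnight, mirroring
# DateTime.(Date.(timestamps)) in the function.
raw = [DateTime(2024, 1, 2, 14, 30), DateTime(2024, 1, 3, 14, 30)]
daily = DateTime.(Date.(raw))
```

Note that the intraday branch instead shifts every timestamp by a fixed two hours, so the two branches produce differently aligned indices.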


Include The File

In your custom scripts or main file, add the following line to include the file: include("CustomDataSources/YahooFinanceUser.jl").

Fetching Data With The User Defined Source

With service set to YahooFinanceUser, the function

EDH.fetch_external_data_from_internet(ext_cons, service, data_fields; clean_and_prepare=true, verbose_error=true)

can now be used to fetch data from the user-defined source.

Explanation Of The Code

Step 1: Define the Data Source

Create a struct for your data source. The struct should be a subtype of the abstract type InternetDataSource.

struct YahooFinanceUser <: InternetDataSource end

Step 2: Declare Available Data Fields

Declare the available data fields by returning a DataFields object that wraps the set of supported field symbols.

function available_data_fields(::Type{YahooFinanceUser})
    return DataFields(Set([:close, :adj_close, :open, :low, :high, :volume]))
end
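
The full listing validates a request by checking that the requested fields are a subset of the available ones. A minimal sketch of that check follows; plain Sets stand in for the fields of a DataFields object, and fields_supported is a hypothetical helper used only for illustration.

```julia
# Plain Sets stand in for the `fields` of a DataFields object here;
# the real DataFields type is provided by the package.
available = Set([:close, :adj_close, :open, :low, :high, :volume])

# Hypothetical helper, for illustration only.
fields_supported(requested) = issubset(requested, available)

ok = fields_supported(Set([:close, :volume]))   # every field is available
bad = fields_supported(Set([:close, :vwap]))    # :vwap is not offered
```

A request that fails this check is answered with an ErrorResult in the full listing, before any network call is made.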

Step 3: Implement the Data Retrieval Function

Define a function that fetches and processes data from the data source. It takes the positional arguments ticker, from, to, and requested_data_fields, and the keyword arguments resolution and skip_missing.

function external_API_timeseries(::Type{YahooFinanceUser}, ticker, from, to, requested_data_fields::DataFields; resolution="1d", skip_missing=false)
    # Validate and prepare API call parameters
    ...
    # API call and data parsing
    ...
    # Data processing and formatting
    ...
    # Return the prepared result
    return prepare_result(result_data, requested_data_fields, skip_missing)
end

Here, result_data should be a Dict that maps Symbols to Vectors, one per time series. The entry under :Index holds the timestamps that index the series.
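
As an illustration of that shape, the sketch below builds a small result_data by hand. The values are made up, and the expectation that every series has the same length as :Index is an assumption, not something the source states; prepare_result itself is not called here.

```julia
using Dates

# Made-up timestamps serving as the index of every series.
timestamps = [DateTime(2024, 1, 2), DateTime(2024, 1, 3), DateTime(2024, 1, 4)]

result_data = Dict{Symbol,Vector}()
result_data[:Index] = timestamps
# JSON.parse returns `nothing` for a JSON null, so gaps can appear like this.
result_data[:close] = [101.5, 102.25, nothing]
result_data[:volume] = [1_200_000, 950_000, nothing]

# Assumption: each series lines up element by element with :Index.
same_length = all(length(v) == length(result_data[:Index]) for v in values(result_data))
```

How gaps such as the `nothing` entries are treated depends on prepare_result and the skip_missing argument.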