Adding A New Data Source
TFire can use a user-defined internet data source seamlessly once the required functions are defined for a user-defined type that subtypes InternetDataSource.
Example
This guide demonstrates how a user would add the (already included) YahooFinance data source in this way, under the name YahooFinanceUser.
Create A New File
Create a new file YahooFinanceUser.jl in the CustomDataSources directory. In that file, add the following code:
using Dates, HTTP, JSON

# InternetDataSource, DataFields, ErrorResult and prepare_result are assumed to be
# in scope, provided by TFire.
struct YahooFinanceUser <: InternetDataSource end

function available_data_fields(::Type{YahooFinanceUser})
    return DataFields(Set([:close, :adj_close, :open, :low, :high, :volume]))
end

function external_API_timeseries(::Type{YahooFinanceUser}, ticker, from, to, requested_data_fields::DataFields; resolution="1d", skip_missing=false)
    # Intraday data cannot extend past the last 60 days
    if !(resolution in ["1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h", "1d", "5d", "1wk", "1mo", "3mo"])
        e = "Yahoo Interval incorrectly specified"
        return ErrorResult(e)
    end
    if !issubset(requested_data_fields.fields, available_data_fields(YahooFinanceUser).fields)
        e = "Requested data fields not available from Yahoo Finance"
        return ErrorResult(e)
    end
    # For daily data, extend the end date by one day
    if resolution == "1d"
        to = to + Day(1)
    end
    # Convert the period bounds to Unix timestamps in whole seconds
    start = round(Int64, datetime2unix(DateTime(from)))
    stop = round(Int64, datetime2unix(DateTime(to)))
    params = Dict("period1" => start, "period2" => stop, "interval" => resolution)
    url = "https://query1.finance.yahoo.com/v8/finance/chart/" * ticker
    local r
    local data
    try
        r = HTTP.get(url; query=params, retries=5)
        data = JSON.parse(String(r.body))
    catch e
        # Network and parsing failures are returned as an ErrorResult
        return ErrorResult(e)
    end
    # A response without timestamps contains no price data
    if !(haskey(data["chart"]["result"][1], "timestamp"))
        return ErrorResult()
    end
    timestamps = unix2datetime.(data["chart"]["result"][1]["timestamp"])
    quote_data = data["chart"]["result"][1]["indicators"]["quote"][1]
    result_data = Dict{Symbol,Vector}()
    result_data[:volume] = quote_data["volume"]
    result_data[:open] = quote_data["open"]
    result_data[:close] = quote_data["close"]
    result_data[:low] = quote_data["low"]
    result_data[:high] = quote_data["high"]
    if resolution in ["1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h"]
        # Intraday: shift timestamps by two hours and reuse the raw prices
        # for the adjusted fields
        timestamps = timestamps .+ Hour(2)
        result_data[:adj_close] = quote_data["close"]
        result_data[:adj_open] = quote_data["open"]
        result_data[:adj_low] = quote_data["low"]
        result_data[:adj_high] = quote_data["high"]
    else
        # Daily and longer: keep only the calendar date and use the
        # adjusted close supplied in the response
        timestamps = DateTime.(Date.(timestamps))
        result_data[:adj_close] = data["chart"]["result"][1]["indicators"]["adjclose"][1]["adjclose"]
    end
    result_data[:Index] = timestamps
    return prepare_result(result_data, requested_data_fields, skip_missing)
end
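The date handling in external_API_timeseries can be checked in isolation, without a network call. The sketch below uses only the Dates standard library and mirrors how from and to become the period1 and period2 query parameters. The helper name yahoo_period_bounds is hypothetical and not part of TFire.

```julia
using Dates

# Hypothetical helper (not part of TFire): mirrors the date handling in
# external_API_timeseries so it can be tested without calling the API.
function yahoo_period_bounds(from, to; resolution="1d")
    # For daily bars, push the end date forward one day, as the code above does.
    if resolution == "1d"
        to = to + Day(1)
    end
    # Unix timestamps in whole seconds
    start = round(Int64, datetime2unix(DateTime(from)))
    stop = round(Int64, datetime2unix(DateTime(to)))
    return Dict("period1" => start, "period2" => stop, "interval" => resolution)
end

params = yahoo_period_bounds(Date(2024, 1, 1), Date(2024, 1, 31))
```

With the daily resolution, the requested window from 1 January to 31 January spans 31 full days because of the extra day added to the end date.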
Include The File
In your custom scripts or main file, add the following line to include the file: include("CustomDataSources/YahooFinanceUser.jl").
Fetching Data With The User Defined Source
The function EDH.fetch_external_data_from_internet(ext_cons, service, data_fields; clean_and_prepare=true, verbose_error=true), with service specified as YahooFinanceUser, can now be used to fetch data from the user-defined YahooFinanceUser source.
Explanation Of The Code
Step 1: Define the Data Source
Create a struct for your data source. The struct should be a subtype of the abstract type InternetDataSource.
struct YahooFinanceUser <: InternetDataSource end
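The struct has no fields because it only serves as a tag for method dispatch. As a minimal sketch of that mechanism, the example below defines a stand-in InternetDataSource so that it runs outside TFire; source_name and OtherSource are hypothetical names used only for illustration.

```julia
# Stand-in for the abstract type that TFire provides; defined here only so the
# sketch runs on its own.
abstract type InternetDataSource end

struct YahooFinanceUser <: InternetDataSource end
struct OtherSource <: InternetDataSource end  # hypothetical second source

# Methods are selected by the type itself, so no instance is ever constructed.
source_name(::Type{YahooFinanceUser}) = "Yahoo Finance (user-defined)"
# Fallback for any other subtype of InternetDataSource.
source_name(::Type{T}) where {T<:InternetDataSource} = "unnamed source"

source_name(YahooFinanceUser)  # selects the specific method
source_name(OtherSource)       # selects the fallback
```

The functions in the following steps use the same ::Type{YahooFinanceUser} argument pattern.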
Step 2: Declare Available Data Fields
Define the available data fields by returning a DataFields object with a list of field symbols.
function available_data_fields(::Type{YahooFinanceUser})
    return DataFields([:close, :adj_close, :open, :low, :high, :volume])
end
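The retrieval function uses the declared fields to reject requests the source cannot serve. The sketch below illustrates that check with a stand-in DataFields type that models only a fields member; the real TFire type may differ, and fields_supported is a hypothetical helper.

```julia
# Stand-in for TFire's DataFields; the real type may differ. Only the `fields`
# member used by the example code is modelled here.
struct DataFields
    fields::Set{Symbol}
end

available = DataFields(Set([:close, :adj_close, :open, :low, :high, :volume]))

# Hypothetical helper: true when every requested field is offered by the source.
fields_supported(requested::DataFields, offered::DataFields) =
    issubset(requested.fields, offered.fields)

ok = fields_supported(DataFields(Set([:close, :volume])), available)
bad = fields_supported(DataFields(Set([:close, :dividends])), available)
```

Here ok is true and bad is false, because :dividends is not among the declared fields.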
Step 3: Implement the Data Retrieval Function
Define a function to fetch and process data from the data source. This function should accept the parameters ticker, from, to and requested_data_fields. It should also accept the optional keyword arguments resolution and skip_missing.
function external_API_timeseries(::Type{YahooFinanceUser}, ticker, from, to, requested_data_fields::DataFields; resolution="1d", skip_missing=false)
    # Validate and prepare API call parameters
    ...
    # API call and data parsing
    ...
    # Data processing and formatting
    ...
    # Return the prepared result
    return prepare_result(result_data, requested_data_fields, skip_missing)
end
Here, result_data should be a Dict that maps Symbols to Vectors holding the time series. The entry under the symbol :Index should hold the timestamp index.
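As an illustration of that shape, the sketch below builds result_data from a mock of an already-parsed response, following the same key layout as the Yahoo example. The numbers are made up for illustration, and the final check is only a sanity test, not something TFire is known to perform.

```julia
using Dates

# Mock of an already-parsed chart response (two daily bars). The values are made
# up for illustration and do not come from Yahoo Finance.
data = Dict("chart" => Dict("result" => [Dict(
    "timestamp" => [1704205800, 1704292200],
    "indicators" => Dict(
        "quote" => [Dict("close" => [10.0, 10.5], "volume" => [1000, 1200])],
    ),
)]))

result = data["chart"]["result"][1]
quote_data = result["indicators"]["quote"][1]

result_data = Dict{Symbol,Vector}()
result_data[:close] = quote_data["close"]
result_data[:volume] = quote_data["volume"]
# Daily bars: drop the time of day, keeping the calendar date as a DateTime.
result_data[:Index] = DateTime.(Date.(unix2datetime.(result["timestamp"])))

# Sanity check: every series lines up with the timestamp index.
aligned = all(length(v) == length(result_data[:Index]) for v in values(result_data))
```

Each key other than :Index names one data field, and all vectors have one entry per timestamp.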