© 2025 Groups.io

Historical Data Download : Compare Experiences


 

I, like many of you I am sure, have created a data downloader. I came across a problem recently with my code which has been running well for ages (years). The difference this time is I am trying to download quite old data. Let me explain.

1) I have no problem downloading the last 10 years of daily price data for the S&P500 stocks (503 currently). It works fine, with no hang-ups except for a couple of stocks (VICI & LIN) for which IB have admitted there is a problem and which need special handling. (Note: I always request the shorter of 10 years or the period back to the head timestamp.)

2) Recently I have been trying to run some data analysis on a longer history, so I set my "END DATE" to 20141231-115959 US/EASTERN and started a download similar to #1 above, asking for 10 years of history. This time I get to AXP, which happens to be my 98th API call, and TWS returns nothing. No error in the API log, just nothing. I removed AXP and tried again, but the same thing happened with the 98th stock (this time AZO). After a 2-minute timeout waiting for data or a response, the next API call to TWS gets no reply either (this time it happens to be a reqHeadTimestamp() call). If I move AXP to the head of my list, the AXP data downloads fine, so it is not a data issue with that stock.

My code is very conservative and I download data serially, so there are no concurrent calls to reqMktData(). I suspect this is a form of pacing, but is it normal to get no response at all? I literally have to disconnect and reconnect my app to get things moving again.

Has anyone seen a problem on older data vs newer data like this?

Perhaps rather than trying to do this in 2 steps I could just ask for 20 years daily history.

Can anyone offer me pointers to getting older data out of TWS?


 

My downloader works rather differently, in that it uses a lot of asynchrony to get the data as fast as possible.

I’ve just run it for daily S&P500 bars from 01/01/2005 to 01/01/2015. It went through the whole list, taking 66 minutes in total.

It doesn’t do anything differently for older data. For each symbol (read from stdin) it does a contract details request (just to make sure the symbol is valid: my S&P500 list is a bit out of date, with about 30 or so incorrect symbols so far), followed by a historical data request, and then writes the returned data out to a text file. Multiple contract details requests and historical data requests are made concurrently, up to 20 of them for the contracts. As each contract request completes, the corresponding historical data request is added to a queue and the next contract request is made. As each data request completes, the next one in the queue is fired.
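A minimal sketch of the flow described above, using Python's asyncio. It is not Richard's code: `fetch_contract` and `fetch_history` are hypothetical stand-ins for the real contract details and historical data requests, and the only number taken from the post is the limit of 20 concurrent requests.

```python
import asyncio

MAX_CONCURRENT = 20  # concurrency limit mentioned in the post


async def fetch_contract(symbol):
    """Hypothetical stand-in for a contract details request."""
    await asyncio.sleep(0)  # pretend to wait for a reply
    if not symbol.isalpha():
        raise ValueError(f"unknown symbol: {symbol}")
    return {"symbol": symbol, "exchange": "SMART"}


async def fetch_history(contract):
    """Hypothetical stand-in for a historical data request."""
    await asyncio.sleep(0)
    return [f"{contract['symbol']} bar"]


async def download(symbols):
    contract_slots = asyncio.Semaphore(MAX_CONCURRENT)
    history_queue = asyncio.Queue()
    results, errors = {}, {}

    async def validate(symbol):
        # At most MAX_CONCURRENT contract requests are in flight at once.
        async with contract_slots:
            try:
                history_queue.put_nowait(await fetch_contract(symbol))
            except ValueError as exc:
                errors[symbol] = str(exc)  # invalid symbols are skipped

    async def history_worker():
        # Each worker takes the next queued request once its last one completes.
        while True:
            contract = await history_queue.get()
            try:
                results[contract["symbol"]] = await fetch_history(contract)
            finally:
                history_queue.task_done()

    workers = [asyncio.create_task(history_worker())
               for _ in range(MAX_CONCURRENT)]
    await asyncio.gather(*(validate(s) for s in symbols))
    await history_queue.join()  # wait until every queued request is done
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results, errors


results, errors = asyncio.run(download(["AXP", "AZO", "BAD1"]))
```

Here "BAD1" plays the role of an out-of-date symbol: it ends up in `errors` and never reaches the historical data queue.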

So it appears that TWS itself doesn’t prevent old data downloads running smoothly.

I don’t know what to suggest regarding your issue, though it does look like it’s likely to be a fault in your code. The fact that it hangs after 98 API requests looks very suspicious. But then why wouldn’t it do the same with more recent data?...

Using single requests for 20 years should work fine, but if you’re anything like me you’ll probably want to track down the cause of the problem.

Richard

From: [email protected] <[email protected]> On Behalf Of David Armour
Sent: 21 February 2023 02:06
To: [email protected]
Subject: [TWS API] Historical Data Download : Compare Experiences


 

Thanks for your input Richard.

I did some testing myself today as well. I modified my downloader to get up to 25 years history. It took 1hr 50mins to get 25 years for the S&P500 stocks. (Of course not all stocks have 25 years history, but it worked.)

Interestingly, I "found" an option (new to me) in the TWS API settings, and turning it on has made my code run far more smoothly. I realised I had not been getting any errors reported for pacing issues. Enabling this option makes the error messages arrive promptly.



It also helped me discover the problem with AXP. I have not been setting any PrimaryExchange when pulling historical data. For some reason TWS accepts this for recent history, but for older history it seems to keel over and hang. (As I mentioned, the API log shows my request going to TWS correctly but nothing comes back. I then saw something strange: I send a cancelHistoricalData() when I hit the 2-minute timeout, and I get an error message saying that no such query exists. Does this mean that TWS did not even send my reqHistoricalData() to IB? Does TWS actually filter requests itself?)

Anyway, I manually added the PrimaryExchange for AXP (my symbol list lets me put the PrimaryExchange after the symbol, e.g. AXP@NYSE) and it worked fine.
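As an illustration of the AXP@NYSE list format, here is a small parsing sketch. `ContractSpec` is a hypothetical stand-in for the API's contract object, and the STK/USD/SMART defaults are assumptions based on the US stock use case in this thread.

```python
from dataclasses import dataclass


@dataclass
class ContractSpec:
    """Hypothetical stand-in for the fields of an API contract object."""
    symbol: str
    sec_type: str = "STK"
    currency: str = "USD"
    exchange: str = "SMART"
    primary_exchange: str = ""  # empty means "not specified"


def parse_symbol_entry(entry):
    """Split 'SYMBOL' or 'SYMBOL@PRIMARY' into a ContractSpec."""
    symbol, _, primary = entry.strip().partition("@")
    if not symbol:
        raise ValueError(f"missing symbol in entry: {entry!r}")
    return ContractSpec(symbol=symbol.upper(),
                        primary_exchange=primary.upper())


axp = parse_symbol_entry("AXP@NYSE")
msft = parse_symbol_entry("MSFT")
```

With this, `axp.primary_exchange` is "NYSE" while `msft.primary_exchange` stays empty, so entries without a suffix behave as before.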

Richard, this brings me to a question for you. You call reqContractDetails() beforehand to check the contract, which I think is smart. However, do you have a methodology for selecting the correct contract from the list that gets returned? Could you share your logic with me? Once you have the contract, do you specify all of its fields in the reqHistoricalData() call (e.g. PrimaryExchange) or leave some blank?


Rgds,
David

P.S. I am exactly like you. I want to fix this.... :)


 

David

The question you ask is not easy to answer succinctly, but I’ll give it a go. The description I gave earlier is very much simplified, but it captures the main sequence of events.

The first thing to say is that my free-standing downloader program uses the contract data and historical data facilities in my trading platform, which are also used for other purposes: eg fetching data for a chart, priming an automated trading strategy with data to initialise indicators, etc. So these mechanisms are completely reusable by any application that needs them. They are very sophisticated: an application can ask for bars of any size, over any duration, or any number: for example 17-minute bars from 01/01/2023 until now; constant volume and constant range bars are also supported.

The bulk of the logic in the downloader program itself is to do with the processing of command-line inputs, issuing the requests, and writing the results to files (or stdout).

The contract data facilities allow fetching either a single contract or multiple contracts. If a contract specifier that does not identify a unique contract (for example MSFT@SMART) is given to a request for a single contract, an error is returned. The downloader program always requests single contracts, so the contract specifier supplied in the relevant command line input must specify a unique contract for the request to succeed.

So for example, MSFT@SMART will give an error because MSFT can be traded via several SMART routing infrastructures. To get the US one, you’d have to use MSFT@SMARTUS; for the London one, you’d use MSFT@SMARTUK. (Note that “SMARTUS”, “SMARTUK” etc are terms defined by the platform that cause the appropriate primary exchange to be included in requests for SMART routing: this means that users don’t need to understand what ‘primary exchange’ is all about.)

The results of each TWS API contract details request are cached by the contract data facilities. Whenever a TWS API contract details request is to be made, the cache is checked first to see if the data is already available. If it is, the cached contract data can be used directly. For example for the historical data scenario, the contract fetcher will first check the cache to see if the requested contract is already available: if so, the cached contract is used directly in the historical data API request; if not, a contract details request is made to the API and the returned results are cached and then used in the historical data request. So the short answer to the question is that all contract fields are used in the historical data request, including primary exchange.
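The check-the-cache-first behaviour can be sketched as below. This is only an illustration of the idea, not the platform's implementation: `ContractCache` and `fake_contract_details` are hypothetical names, and the cache is keyed on the contract specifier string.

```python
class ContractCache:
    """Cache contract lookups so each specifier is requested only once."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs the real request
        self._cache = {}
        self.requests_made = 0   # counts real requests, for illustration

    def get(self, specifier):
        if specifier not in self._cache:
            self.requests_made += 1
            self._cache[specifier] = self._fetch(specifier)
        return self._cache[specifier]


def fake_contract_details(specifier):
    """Hypothetical stand-in for a contract details request."""
    symbol, _, routing = specifier.partition("@")
    return {"symbol": symbol, "routing": routing or "SMART"}


cache = ContractCache(fake_contract_details)
first = cache.get("MSFT@SMARTUS")   # goes to the (fake) API
second = cache.get("MSFT@SMARTUS")  # served from the cache
```

After the two calls, `cache.requests_made` is 1 and both results are the same object, which is then available for the historical data request.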

I don’t know whether this rather convoluted description (which still leaves out a ton of detail) is of any help or interest to you, but while checking out some of the information I noticed a couple of things that need attention, so it’s been worthwhile for me!

Richard


 

Richard,

Thanks for the reply.

If I am reading your reply correctly, your code requires/expects a unique contract to successfully download data. My issue is slightly different. I am trying to identify the unique contract from the symbol, knowing that the stock trades in the US. The reason I do it this way is that I am scraping websites for stock lists like Russell 2000, etc., and I do not want to manually define the contract for thousands of stocks.

The problem I have is that occasionally, as you point out, using SMART without setting the PrimaryExchange is not unique and fails. I have created some workarounds for this in my code but it is clunky and I was hoping there might be a "correct" way to do it.

BTW - I would be interested to know if you are able to download data for the symbol VICI, which is in the S&P500. No matter what I do I get an error telling me: "No historical market data for VICI/STK@VALUE Last 0"

This happens even when specifying the PrimaryExchange as NYSE. It also seems impossible to get a Head Timestamp for this stock. I have an open ticket with IBKR for a month now without any response from the developers, other than them acknowledging the problem.

Rgds, David


 

David

First, regarding VICI, I find it returns data up to 5 years ago, but gives an error for anything older, as you discovered. No idea why. There are odd quirks like this here and there, and IB never seem interested in fixing them.

Yes, my downloader expects a unique contract, but any US stock symbol with Exchange=”SMART” and PrimaryExchange=”NYSE” will give you a unique contract. All the PrimaryExchange does is disambiguate SMART for contracts that are traded in more than one SMART scope.

For example, MSFT trades in four SMART scopes (which you can see with the ContractInspector): Europe (with currency GBP), Canada, Europe (with currency CHF), and US. So PrimaryExchange=”EBS” will give you the European one with currency CHF.

Generally speaking, you don’t actually need to specify the primary exchange at all with SMART, as just supplying the currency is enough to disambiguate. Oddly, though, the only way to get the MSFT SMART contract with currency CAN is to provide PrimaryExchange=”AEQLIT”: using the CAN currency here doesn’t do it – and what’s even odder is that AEQLIT is not even returned as a valid exchange for this contract.

So I would suggest that if you scrape a symbol that you suspect is a US stock, just try Exchange=”SMART” and PrimaryExchange=”NYSE”. It may be that there are some other quirks that this won’t work for, but that would at least give you something to investigate. And by the way you can alternatively use “NASDAQ”, “ARCA”, “BATS”, etc for PrimaryExchange with the same results – they all just tell IB that you’re talking about the US MSFT contract.

I hope this helps…

[PS: by the way I could easily enhance the downloader so that it would download data for all contracts in the case where a non-unique contract is specified, so that in the MSFT case above I’d get the data for all four SMART contracts (ignoring data subscription issues). But I can’t think of any good reason for it to work like this.]


 

David

Not sure this answers your request, as it seems you are looking for 10 years.
In my DB the head timestamp is 1508198412, which is October 17, 2017 12:00:12 AM. Don't ask me where it comes from, probably not IBKR.
Wikipedia says "Vici completed its IPO on the [...] in February 2018".

Using "VICI", "SMART", "CONID" (292080616), "TRADES"
I was able to get 30 daily bars: HistoricalDataEnd - StartDate: 20180226 12:00:00(1519646400), EndDate: 20180328 12:00:00(1522238400)
However, HistoricalDataEnd doesn't seem to reflect that the first bar I actually got was 20180214.
It seems IBKR has data prior to 20180214, but it complains that I need to request it on PINK (while using CONID/SMART). I interpret that as there being no NYSE data prior to 20180214.
Can't test it now.
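The epoch values quoted above can be checked with a few lines of Python. The sketch assumes the numbers are seconds since 1970-01-01 UTC; under that assumption they reproduce the date strings in the post, so the 12:00:00 times appear to be UTC.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


quoted = {
    "head timestamp": 1508198412,
    "StartDate": 1519646400,
    "EndDate": 1522238400,
}
formatted = {label: epoch_to_utc(value).strftime("%Y%m%d %H:%M:%S")
             for label, value in quoted.items()}

for label, text in formatted.items():
    print(f"{label}: {text}")
# head timestamp: 20171017 00:00:12
# StartDate: 20180226 12:00:00
# EndDate: 20180328 12:00:00
```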


 

I modified my auto-downloading code to look up the Head Timestamp, to avoid requesting data from before it, but I have found that this creates more problems than it solves.

reqHeadTimestamp() suffers from pacing violations that do not conform to the standard 50-messages-per-second rule. The problem is that when a reqHeadTimestamp() call hits a pacing error it hangs, doesn't recover, and doesn't report any error message (unless you tick the option in the API settings to reject messages instantly). There seems to be some internal code problem in the API on the IBKR side, but trying to explain this to them would be a nightmare I am not willing to pursue.

It is true that if you load VICI data for any period within its life in the SP500, it works without problem. There are other symbols that suffer similar problems when trying to get a very long history: AAL, LIN, MNST. Those 4 fail when trying to get 25 years of daily history. Every other stock in the SP500 works when asking for 25 years of daily history with Richard's proposal of using SMART + NYSE. (In my code I am actually calling reqContractDetails() with the symbol, currency, sec type & exchange set (e.g. MSFT, USD, STK, SMART) and using the contract returned in my call to reqHistoricalData(), which also works fine.)

It seems clear, as also mentioned by Richard, that IB do not care much about fixing their data problems, so we just have to live with them. As painful as it is to my Engineering core, I will write exceptions into my code so that it works around these few stock failures automatically.


 

If you ask, using SMART, for a duration of 1 D, bar size 1 D, end date 20180110, you get one bar back, for 20180109:

date=20180109-00:00:00 open=20.2 high=23.99 low=20 close=20.21 volume=2245 barCount=19 WAP=20.036965

Any attempt to get data earlier than this fails, so I presume this is the oldest date that IB have any data for.


 

> The problem is that when the reqHeadTimestamp() faces a pacing error it hangs and doesn't recover and it doesn't report any error messages
Thank you for posting this. Some of my code to download data started hanging a few months ago. I ended up setting alarms and using signal handlers, but it was a pain to deal with and I never bothered tracking down the cause. I may remove the reqHeadTimestamp() calls, or run them in a separate process and cache the results.
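One alternative to alarms and signal handlers is to wait on a `threading.Event` with a timeout, so a request that never gets a reply raises an error instead of hanging. This is a sketch of that different technique, not the poster's code; `PendingRequest` is a hypothetical helper, and in a real client `complete()` would be called from the API callback that delivers the reply.

```python
import threading


class PendingRequest:
    """Wait for a reply that may never arrive."""

    def __init__(self):
        self._done = threading.Event()
        self.result = None

    def complete(self, result):
        # Would be called from the callback thread when a reply arrives.
        self.result = result
        self._done.set()

    def wait(self, timeout):
        # Event.wait() returns False if the timeout expires first.
        if not self._done.wait(timeout):
            raise TimeoutError(f"no reply within {timeout} s")
        return self.result


# A request that is answered shortly after being issued.
answered = PendingRequest()
threading.Timer(0.01, answered.complete, args=["reply"]).start()
reply = answered.wait(timeout=1.0)

# A request that is never answered, like the hanging calls described above.
silent = PendingRequest()
try:
    silent.wait(timeout=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
```

On a timeout the caller can then decide whether to cancel the request, skip the symbol, or reconnect.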


 

Shared from my desk:
I am unable to work out the rule IBKR uses for pacing reqHeadTimestamp. But there is one!
I get the feeling that exhaustive scanning is disliked by IBKR, and that a batch does worse than interleaved requests. I rarely get a pacing violation with reqHeadTimestamp; it simply takes forever to answer (and for some combinations it simply never answers, like VICI/NYSE/CONID).

I "feel" the pacing rule is partly handled in TWS or GW, as it seems that a disconnect/reconnect improves your chance of retrying successfully sooner.
It also seems related to reqHistorical and all similar data activities: the more data requests you make, the less chance that reqHeadTimestamp runs through. Somehow that makes sense.

I feel it is unsound practice to call reqHeadTimestamp with every reqHistorical.
One caveat is that there seem to be multiple head timestamps: they differ for each exchange and also with the whatToShow.
I default to "SMART" "Trades" to get a worst-case value, but that doesn't mean IBKR doesn't have more data available in a specific case.

I cache it, with a general update once a month. That seems enough.
Batch size of 150. The first 30 generally come fast, the next 70 are painfully slow, and sometimes I am lucky enough to get the last 50.
With a cool-down period of 2 hours, it takes about 10 days to update all of the US. (Maybe it can be done at a faster pace; it was fast enough for me and I had no time to dig further.)
I never make parallel calls to reqHeadTimestamp. That seems worse than waiting for each answer before triggering the next request, if you want to improve your chance of getting more than 30 head timestamps quickly.
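A rough sketch of the caching scheme described in this post: pick the symbols whose cached head timestamp is missing or more than a month old, then split them into batches of 150. The function names and the cache layout are assumptions made for illustration, and the cached values are placeholders rather than real head timestamps.

```python
from datetime import datetime, timedelta

REFRESH_AFTER = timedelta(days=30)  # "general update once a month"
BATCH_SIZE = 150                    # batch size quoted in the post


def stale_symbols(cache, now):
    """Return symbols whose cached entry is missing or over a month old.

    `cache` maps symbol -> (head_timestamp, fetched_at), or None if the
    symbol has never been fetched.
    """
    stale = []
    for symbol, entry in cache.items():
        if entry is None or now - entry[1] >= REFRESH_AFTER:
            stale.append(symbol)
    return sorted(stale)


def batches(items, size=BATCH_SIZE):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


now = datetime(2023, 3, 1)
cache = {
    "AXP": ("placeholder-1", datetime(2023, 2, 20)),  # fetched 9 days ago
    "AZO": ("placeholder-2", datetime(2023, 1, 10)),  # fetched 50 days ago
    "VICI": None,                                     # never fetched
}
todo = stale_symbols(cache, now)
batch_sizes = [len(b) for b in batches(list(range(400)))]
```

Each batch would then be requested serially, one symbol at a time, with the cool-down between batches.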


 

Contrary to what you say, reqHeadTimestamp for VICI does not ‘never answer’: it returns an error 162:

“Historical Market Data Service error message:No historical market data for VICI/STK@VALUE Last 0”

Personally, I never use reqHeadTimestamp at all: I see no value in it.

Richard


 

The error reported when making a reqHeadTimestamp() to VICI (or the other 3 stocks I mentioned from the SP500 - I am sure there are others as well) is interesting and gives some insight to the way the API command works internally. Clearly the code calls its own 'reqHistoricalData()-like' function to find the earliest date which then fails in exactly the same way that a call to reqHistoricalData() does with a period longer than available data. I suspect this internal call is the source of the eventual pacing problem that never responds.

Try calling reqHeadTimestamp() on 500+ symbols in a row with a 20 ms gap (i.e. 50 API calls per second) and your code will eventually hang with no error reported.
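For reference, a gap of 20 ms corresponds to 1 / 0.02 = 50 calls per second. A minimal spacer that enforces such a gap between calls might look like the sketch below. `Throttle` is a hypothetical helper, and since the post reports that this rate still hangs reqHeadTimestamp() eventually, the right gap for that call is unknown here and would need experimenting.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, min_gap_s, clock=time.monotonic, sleep=time.sleep):
        self._min_gap = min_gap_s
        self._clock = clock   # injectable so the class can be tested
        self._sleep = sleep
        self._last = None     # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_gap - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


GAP_S = 0.02                   # 20 ms, as in the post
calls_per_second = 1 / GAP_S   # the "50 API calls per second" figure

throttle = Throttle(GAP_S)
start = time.monotonic()
for _ in range(3):
    throttle.wait()            # an API request would be issued here
elapsed = time.monotonic() - start
```

Three calls need at least two gaps, so `elapsed` comes out at roughly 40 ms or more.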

I agree with Richard that a call to reqHeadTimestamp() is useless, since setting a duration that goes back before the head timestamp will not usually cause any error. In the cases where it does, reqHeadTimestamp() doesn't help either, because of the way it internally calls its own reqHistoricalData().

In case anyone is interested in the workaround I implemented, for those 4 stocks (AAL, LIN, MNST, VICI) I used the TWS Daily Chart to go back to the earliest date I could get and used this as my reference point for an exception table in my database. As long as you do not ask for data before that point in time then there will be no error in the call to reqHistoricalData(). It is interesting to note that the same error will appear on the top left of the TWS Chart if you try to get monthly bars that go back before this reference date so there is a clear bug in their data which goes beyond just the API.
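The exception-table workaround can be sketched as a lookup that clamps the requested start date. The table name, the function, and the date shown for VICI are placeholders for illustration; the real reference dates are whatever the TWS daily chart shows for each symbol.

```python
from datetime import date

# Hypothetical exception table: earliest date that is safe to request.
# The date below is a placeholder, not a verified value.
EARLIEST_SAFE_DATE = {
    "VICI": date(2018, 1, 9),
}


def clamp_start(symbol, requested_start, table=EARLIEST_SAFE_DATE):
    """Never request data from before a symbol's reference date."""
    floor = table.get(symbol)
    if floor is None:
        return requested_start          # no known problem for this symbol
    return max(requested_start, floor)  # later of the two dates


vici_start = clamp_start("VICI", date(2000, 1, 1))
msft_start = clamp_start("MSFT", date(2000, 1, 1))
```

Symbols that are not in the table pass through unchanged, so only the few problem stocks get a shortened request.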

Thanks for everyone's input on this. I have finally found a way to automatically load a long history (25 years tested successfully) of any list of US stocks without failure.