
record and replay ib_async session (data only)?


 

How hard would it be to modify ib_async to record a session for playing back later? The motivation is that it would give a general mechanism for recording data for later use, such as backtesting.

Specifically, I would like to have a client that runs ib_async (suitably modified) for a day, subscribing to various data feeds such as ticks and level 2 data (but not making any trades or other active requests that modify account state). I would like the modified ib_async to record all incoming messages from TWS in a file, so that later I could play the session back. That is, run a different client, with ib_async simulating the incoming TWS messages recorded in the file, instead of actually connecting to TWS.

How hard would it be to modify ib_async to allow this? It seems like it _might_ be relatively easy, mainly by modifying the connection class.

More generally, one could imagine recording a session that also included activities such as trading, and being able to replay those verbatim, with some kind of inspector that allowed one to zoom in and inspect critical moments in a session.

Or perhaps someone has already tried this?

-Neal


 

What do you gain from this approach?

If I want to backtest something, I only care about the data and my algo code; ib_async is just an interface to the brokerage and doesn’t make my algo work or not work. I could use ib_async to request historical data that I could save and later backtest using some replay mechanism built into the backtester.

If you are successful in recording and playing back a ‘session’, I would see you gain the ability to test whether there are bugs in ib_async or TWS/GW, neither of which has anything to do with (back)testing the efficacy of your algo, does it?



 

What do you gain from this approach?

Mainly, as far as I can tell (per ), historical data is not available for some kinds of data. E.g. my algo uses level 2 data, and I don't know how to request historical level 2 data.


 

I don’t know either, but since you can request the real-time level 2 data via reqMktDepth, if you were to request it and subsequently store it, would that give you what you need?
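For example, a minimal sketch along those lines (the contract, port, clientId, and file name here are just placeholders):

from ib_async import IB, Stock

ib = IB()
ib.connect("127.0.0.1", 7497, clientId=1)  # placeholder host/port/clientId

contract = Stock("AAPL", "SMART", "USD")  # example contract
ticker = ib.reqMktDepth(contract, numRows=5)

def on_update(t):
    # append a timestamped snapshot of the book on every depth update
    with open("depth_log.txt", "a") as f:
        f.write(f"{t.time} bids={t.domBids} asks={t.domAsks}\n")

ticker.updateEvent += on_update
ib.run()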



 

Jason,

Yes. Also, I think that what I described may be the easiest way to do that.

One reason is that to store level 2 data efficiently you need some kind of diff-based approach (as the level 2 book is built and changes incrementally over time).

This is how level 2 data is transmitted from TWS, and ib_async is already built to maintain the level 2 book over time from the incremental updates from TWS.
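To sketch what those deltas look like: each depth update from TWS carries a position, an operation code (0 = insert, 1 = update, 2 = delete, per the IB API docs), a price, and a size, and applying one to a book is roughly:

from dataclasses import dataclass

@dataclass
class DOMLevel:  # simplified stand-in for ib_async's DOMLevel
    price: float
    size: float

def apply_depth_delta(book: list, position: int, operation: int,
                      price: float, size: float) -> None:
    if operation == 0:    # insert a new level at this position
        book.insert(position, DOMLevel(price, size))
    elif operation == 1:  # update the level in place
        book[position] = DOMLevel(price, size)
    elif operation == 2 and position < len(book):  # delete the level
        del book[position]

So recording just the (position, operation, price, size) tuples is the diff-based storage I have in mind.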
I should add that I was easily able to overload the Connection class in connection.py to store all messages sent to and received from TWS. I will make a separate reply summarizing that code so far.

-Neal


 

For the record, something like the following is sufficient to record a session, that is, to record the messages passed between TWS and ib_async in a pickle file, presumably for later "playback".

Implementing playback via ib_async will surely take more thought. ib_async would be modified so that, instead of connecting to TWS, the Connection class opens the saved pickle file and starts playing back the saved messages from TWS to ib_async. The application using this modified ib_async would have to be built specifically to consume the data that ib_async generates from these messages, without making calls that cause ib_async to send messages to TWS. What I haven't yet looked into is the extent to which ib_async itself makes its own calls to TWS, and the extent to which the playback would have to synchronize itself around any such calls.

-Neal
# """ wrap connection.Connection to save TWS <-> ib_async messages, for later playback
# """

import atexit
import datetime
import inspect
import logging
import pickle
from functools import wraps
from typing import BinaryIO

import ib_async
from ib_async.connection import Connection

logger = logging.getLogger(__name__)


def _now():
return datetime.datetime.now().astimezone()


# method decorator

def
_pickle(method):

_dump_name = method.__name__

if inspect.iscoroutinefunction(method):

@wraps(method)
async def wrapper(self, *args):
pickle.dump((_now(), _dump_name, *args), self._pickle_file)
return await method(self, *args)

else:

@wraps(method)
def wrapper(self, *args):
pickle.dump((_now(), _dump_name, *args), self._pickle_file)
return method(self, *args)

return wrapper


class PickleConnection(Connection):
_pickle_file: BinaryIO

def __init__(self):
self._open_pickle_file()
super().__init__()

def _open_pickle_file(self) -> None:
timestamp = _now().strftime("%Y_%m_%d_%H_%M_%S")
filename = f"connection_{timestamp}.pkl"
atexit.register(self._close_pickle_file)
self._pickle_file = open(filename, "wb+")

def _close_pickle_file(self):
if not self._pickle_file.closed:
self._pickle_file.close()

@_pickle
async def connectAsync(self, host: str, port: int) -> None:
await super().connectAsync(host, port)

@_pickle
def disconnect(self) -> None:
super().disconnect()

@_pickle
def sendMsg(self, msg: bytes) -> None:
super().sendMsg(msg)

@_pickle
def connection_lost(self, exc) -> None:
super().connection_lost(exc)
self._pickle_file.flush()

@_pickle
def data_received(self, data: bytes) -> None:
super().data_received(data)

# patch ib_async to use PickleConnection instead of Connection

ib_async.connection.Connection = PickleConnection
ib_async.client.Connection = PickleConnection

# SCRIPT TO DUMP PKL FILE

# #!/usr/bin/env python3
#
# import datetime
# import sys
# import pickle
#
# filename = sys.argv[1]
#
# time: datetime.datetime
# method_name: str
#
# with open(filename, "rb") as fr:
#     try:
#         while True:
#             time, method_name, *args = pickle.load(fr)
#             timestamp = time.strftime("%H:%M:%S:%f")[:-1]
#             match method_name:
#                 case "data_received" | "sendMsg":
#                     assert len(args) == 1
#                     args = f"{len(args[0])}b"
#                 case _:
#                     args = ", ".join(map(str, args))
#             print(f"{timestamp}: {method_name}({args})")
#     except EOFError:
#         pass
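
A very rough, untested sketch of what the playback side might look like: read the pickle file and feed each recorded incoming payload to a callback at the original pace. How that callback would hook into ib_async's Client is exactly the open question above.

import asyncio
import pickle

async def replay(filename: str, data_received) -> None:
    # Feed recorded data_received payloads to a callback, sleeping to
    # reproduce the original inter-message timing.
    with open(filename, "rb") as f:
        prev_time = None
        try:
            while True:
                time, name, *args = pickle.load(f)
                if name != "data_received":
                    continue  # skip recorded outgoing/lifecycle messages
                if prev_time is not None:
                    await asyncio.sleep((time - prev_time).total_seconds())
                prev_time = time
                data_received(args[0])
        except EOFError:
            pass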


 

After poking around, I now believe that it would be fairly hard to implement a mechanism for recording and replaying an ib_async session for the purpose of capturing data.

For anybody who understands reasonably well how ib_async works (unlike me, apparently :-), that's probably pretty obvious. The basic reason is that the IB.Client and IB.Wrapper classes (which, respectively, send messages to TWS and receive messages from TWS) are not so easily decoupled. I'd be happy to expand on this if anyone is actually interested in the details.

-Neal


 

Hey Neal,

If you're interested, I have a script that scrapes and saves Level 2 DOM data to a text file. The script's basic layout is that Level 2 DOM ticker data is captured to a globally accessible dataframe via the TickerUpdate event. The script's main body runs on a 15-second loop: every 15 seconds the dataframe is copied and then exported. I copy the dataframe before exporting because I found that exporting it while receiving incoming tickers sometimes caused access issues. I should mention that the dataframe kept expanding throughout the day, and trimming it periodically also caused access issues. I asked ChatGPT for some code to overcome this, and it provided a custom data class that essentially creates a dataframe that trims itself whenever records are added to it: any records older than 60 seconds are trimmed. Setting the object up as a custom data class enabled the self-trimming behavior without suffering from the access issues.
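Roughly along these lines (a sketch, not my actual script; the column names and pandas details are just illustrative):

import datetime
import pandas as pd

class TrimmingFrame:
    """Dataframe wrapper that drops records older than max_age on append."""

    def __init__(self, max_age_seconds: float = 60.0):
        self.max_age = datetime.timedelta(seconds=max_age_seconds)
        self.df = pd.DataFrame(columns=["time", "price", "size"])

    def append(self, row: dict) -> None:
        # row must include a timezone-aware "time" entry
        self.df.loc[len(self.df)] = row
        cutoff = datetime.datetime.now().astimezone() - self.max_age
        self.df = self.df[self.df["time"] >= cutoff].reset_index(drop=True)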
Now that I'm thinking about it, perhaps I could stop copying the dataframe in the main loop as well. I'll probably leave it as is. I find Python can be real finicky and delicate at times. Anywhoozle, the script has some extra things in it too that I would have to clean up, but if you're interested I'll do that and post what I've got.


 

Some guy 555,

In my case I want to be able to "replay" all the level 2 history, not periodic snapshots. This itself won't be too hard to do by capturing the deltas sent by TWS. I expect the main coding issue will be integrating whatever I do into my existing code (which I would have to do even with your script). So it's probably better for me to just roll my own. If I encounter unexpected obstacles, I may come back and ask to see your code. Thank you,

-Neal


 

Hey Neal,

Yeah, I was thinking your end result would be something similar to a script that reads each line of each text file and loads them into a dataframe. From there you would loop over the dataframe and consume its data as inputs to your algo's logic. Reading the dataframe record by record would then serve as your data feed.
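Something like this sketch (the column names and the on_row hook are just illustrative):

import pandas as pd

def replay_file(path: str, on_row) -> None:
    # load the saved log, then hand each record to the algo in order
    df = pd.read_csv(path, parse_dates=["time"])
    for row in df.itertuples(index=False):
        on_row(row)  # each recorded row becomes one "tick" for the algo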
Otherwise, I would ask GPT what programming languages (that you're familiar with), Python libraries, tools, etc., a trader can use to back-test a strategy, and go from there.

I'm personally an Excel guy, so I tend to import the text files into Excel and create various IF, AND, OR formulas to identify the instant some number of conditions are met and then compare that instant's stock price to the stock price some 5, 10, etc. seconds into the future. That approach can be helpful but certainly has its weaknesses.