updateMktDepthL2 missing data points


 

Hello,

Trying to source market depth data on FX pairs, USD.JPY in the case below.
In the code below, self.series[reqId] is just a Python list that I append new ticks to.

def updateMktDepthL2(self, reqId, position, marketMaker, operation, side, price, size, isSmartDepth):
    self.series[reqId].append((time(), side, position, price, size))

Then separately I have a running thread in charge of saving once in a while to a parquet file and resetting the list.

...
if len(app.series[reqId]) >= number_of_rows_threshold:
    print('saving')
    try:
        df = pd.DataFrame(app.series[reqId], columns=['time', 'bid', 'position', 'price', 'quantity'])
        app.series[reqId] = list()
    except ValueError as e:
        print('Value Error', e)
...
df.to_parquet(parquet_file_name, compression='gzip')

The above code works fine most of the time, however once in a while, I get an exception:
Length of values (112071) does not match length of index (112105)

Trying to debug the exception, and putting each column into a separate frame, I get the following row count per column:
  • time 2151751
  • bid 2151825
  • position 2151830
  • price 2151876
  • quantity 2151876

How is that even possible?

Note the numbers differ between the exception (112071) and the investigation (2151751+);
I believe this is because data kept arriving between catching the exception and the debugging I performed.








 

For supporting data: while debugging another session with the same bug, I managed to extract and save the list into a CSV file using the code below.
Nothing looks wrong in the CSV, though.

import numpy as np

np.savetxt("series_output_1.csv", app.series[reqId], delimiter=",", fmt='%s')


 

Can you add 'operation' to your CSV?
I assume your 'bid' = IB's 'side', correct?


It is MANDATORY to use 'operation' to manage and ensure the integrity of your entries; otherwise you might very well end up receiving a request to update a row you never inserted, or an insertion of a row you never removed (a duplicate). That doesn't explain your figures, but it is an area to consider,
because what is surprising is to see entries with price = 0 and volume = 0.


 

Thanks Gordon for your reply.

Yes, 'bid' is 'side'; 'side' doesn't mean anything to me, but "bid" == 1 clearly means we are talking about the bid side.
'Quantity' is 'size', because 'size' is used for many other things in Python and I would rather avoid any name overlap.

Essentially, if price=0 and size=0 it implies operation=2, and with regard to insert or update, it doesn't make any difference to the analysis I am performing.
Considering the amount of data to be stored on multiple FX pairs, I would rather not have a redundant operation column, even if only uint8.

The other thing is, if it were related to the operation field, then we would not have any discrepancy between "bid" and "position",
but we have bid/side 2151825 and position 2151830, which is rather confusing.

Also, how can price and quantity/size have a higher count (2151876) than time (2151751)?


 

Looking at Ewald's ib_insync implementation below, it does essentially the same thing:
put all the info in a tuple and append that tuple to a list. So I am surprised that no one has encountered this issue,
but I am even more surprised that this issue exists in the first place, as it should always work, no matter what.


def updateMktDepthL2(
        self, reqId: int, position: int, marketMaker: str, operation: int,
        side: int, price: float, size: float, isSmartDepth: bool = False):
    # operation: 0 = insert, 1 = update, 2 = delete
    # side: 0 = ask, 1 = bid
    ticker = self.reqId2Ticker[reqId]

    dom = ticker.domBids if side else ticker.domAsks
    if operation == 0:
        dom.insert(position, DOMLevel(price, size, marketMaker))
    elif operation == 1:
        dom[position] = DOMLevel(price, size, marketMaker)
    elif operation == 2:
        if position < len(dom):
            level = dom.pop(position)
            price = level.price
            size = 0

    tick = MktDepthData(
        self.lastTime, position, marketMaker, operation, side, price, size)
    ticker.domTicks.append(tick)
    self.pendingTickers.add(ticker)


class MktDepthData(NamedTuple):
    time: datetime
    position: int
    marketMaker: str
    operation: int
    side: int
    price: float
    size: float


domTicks: List[MktDepthData] = field(default_factory=list)




 

On Sat, Oct 28, 2023 at 08:20 AM, John wrote:

Essentially, if price=0 and size=0 it implies operation=2, and with regard to insert or update, it doesn't make any difference to the analysis I am performing. Considering the amount of data to be stored on multiple FX pairs, I would rather not have a redundant operation column, even if only uint8.

The other thing is, if it were related to the operation field, then we would not have any discrepancy between "bid" and "position", but we have bid/side 2151825 and position 2151830, which is rather confusing.

I won't comment as to whether or not your deduction here is correct or not.

But, I will say that it seems you may be relying on too many assumptions. After all, it's been said that "premature optimization is the root of all evil (or at least most of it) in programming".

Therefore, I would suggest you try two things... 1) make your implementation as "correct" as possible according to the API specification & documentation without taking any shortcuts based on assumptions and 2) continue to use ib_insync as a comparison to see if you can reproduce the problem there.

Finally, since code interactions can get quite complicated, copying around snippets that are surrounded by ellipsis will only go so far. So, I'd try to make the smallest possible (but complete) standalone unit test in order to really isolate the issue.

Having that will also make it easy to present an argument to support people. They can then confirm the existence of bug(s) for themselves. Unlike the justice system, I think any application that exhibits "strange behavior" is presumed guilty until proven innocent (instead of the other way around).

Once this is done it should become very clear where the problem lies. And, you'll be able to go back and optimize more systematically.


 

when i implemented support for market depth myself, i recall i also had some issues with processing the incoming data. i kept just the book and updated it with the incoming data, but iirc the issue was that sometimes i had to delete an item with index that did not exist. i wasn't sure whether the issue was in my app or the sequence was really broken back then... anyway, in this case, i would even wonder what is the real value of market depth data on forex market.


 

BTW, since you mention this:

On Fri, Oct 27, 2023 at 02:54 PM, John wrote:

separately I have a running thread in charge of saving once in a while

I'd like to emphasize that the following is an order of magnitude more important:

On Sat, Oct 28, 2023 at 05:38 PM, buddy wrote:

code interactions can get quite complicated, copying around snippets that are surrounded by ellipsis will only go so far


 

Thank you both.

So, I'd try to make the smallest possible (but complete) standalone unit test in order to really isolate the issue.
@Buddy, these code snippets are pretty self-explanatory and standalone, really: put the depth ticks in a tuple, append the tuple to a list, put the list in a DataFrame, and save it to a parquet file. Even using the small toy example below with the same logic, there's clearly no way this code flow can fail on its own, even if the input is trash. However, you might be correct that it could come from unfortunate thread concurrency, i.e. a huge amount of data being added to the list at the exact time the DataFrame is created, so I might simply need to hard-copy the list and convert that copy to a DataFrame instead of the list itself. I'll try that next week and keep the group posted.

from time import time
import pandas as pd

a = list()
a.append((time(), '1', '2', '2'))
a.append((time(), '1', 0, '3'))
a.append((time(), '1', '1'))
a.append((0, 0, None))

df = pd.DataFrame(a, columns=['time', 'col1', 'col2', 'col3'])
print(df)

output:

           time col1  col2  col3
0  1.698522e+09    1     2     2
1  1.698522e+09    1     0     3
2  1.698522e+09    1     1  None
3  0.000000e+00    0  None  None

anyway, in this case, i would even wonder what is the real value of market depth data on forex market.
@fordfrog, well that's precisely what I am trying to find out! :) But for this I first need to extract and save a substantial amount of data in order to study it.


 

Or maybe better, use a semaphore or lock to do things properly, as copying the list won't prevent another potential conflict when resetting the list to list().
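
For what it's worth, a minimal sketch of that idea (assuming a single lock per subscription list; save_snapshot is an illustrative name, not a TWS API call): both the callback and the saver take the lock, and the saver swaps in a fresh list so the DataFrame is built from a snapshot nobody else is mutating.

from time import time
import threading
import pandas as pd

series_lock = threading.Lock()  # guards app.series[reqId]

# in the EWrapper callback (runs on the API's receiving thread)
def updateMktDepthL2(self, reqId, position, marketMaker, operation,
                     side, price, size, isSmartDepth):
    with series_lock:
        self.series[reqId].append((time(), side, position, price, size))

# in the saver thread: swap in a fresh list under the lock, then build
# the DataFrame from the private snapshot
def save_snapshot(app, reqId, parquet_file_name):
    with series_lock:
        snapshot = app.series[reqId]
        app.series[reqId] = list()
    df = pd.DataFrame(snapshot, columns=['time', 'bid', 'position', 'price', 'quantity'])
    df.to_parquet(parquet_file_name, compression='gzip')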


 

On Sat, Oct 28, 2023 at 08:05 PM, John wrote:

Or maybe better, use a semaphore or lock to do things properly, as copying the list won't prevent another potential conflict when resetting the list to list().

Yes, I'm not sure if a deep copy alone will suffice. But... this is what I meant by "make the smallest possible (but complete) standalone unit test". The test should focus on experimentally confirming your uncertain suspicions; it's a judgement call. Naturally there's no need to confirm what you already know.
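
As one possible shape for such a test (a sketch only; it may not trigger the race on every run, since the outcome is timing-dependent): one thread appends tuples as fast as it can while the main thread repeatedly builds DataFrames from the live list.

import threading
from time import time
import pandas as pd

series = []
stop = threading.Event()

def appender():
    # simulates the updateMktDepthL2 callback firing at a high rate
    while not stop.is_set():
        series.append((time(), 1, 0, 150.0, 1000000))

threading.Thread(target=appender, daemon=True).start()

try:
    for _ in range(10000):
        # building a DataFrame from a list that another thread is appending to
        pd.DataFrame(series, columns=['time', 'bid', 'position', 'price', 'quantity'])
except ValueError as e:
    print('reproduced:', e)
finally:
    stop.set()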


 

When I read your initial post, my first thought was "data corruption due to multi-threading" and it looks like that is where the discussion with Gordon Eldest, buddy, and fordfrog is coming to.

Level II data for active instruments can generate a lot of data and, more importantly, bursts of hundreds or even thousands of callbacks in peak seconds. Take a look at this post where I shared some stats for our setup. We have days with 2,000,000 callbacks within five-minute windows, which is a sustained rate of 6,000 callbacks per second for the entire period.

I am not sure how busy USD.JPY is and whether you subscribe to more than one pair, but it is safe to assume that you will have peak seconds with tens or hundreds of callbacks. On the other side, your code seems to have no synchronization between threads at all. So it is just a matter of time until corruption happens.

You pointed to some code from Ewald's ib_insync. If you look a little closer at the entire library, you will find carefully placed synchronization statements that make sure no data corruption takes place, and a design that eliminates globally shared state as much as possible.

You will find several posts in the archive about multi-threading, how to do it, and that it's hard. And there is a lot of truth to the various contributions, but let me share some thoughts on architecture and design for (massively) multi-threaded TWS API clients that have worked very well for us and yielded rock-stable applications with little or no data-access synchronization. We even have certain applications with 100+ threads that connect to several TWS/IBGW instances simultaneously and process the individual "data firehoses" flawlessly on a multi-processor machine that executes many of them truly in parallel.

Granted, we develop in Java and the language has a rich set of concurrency features, but all programming languages (including Python) have add-ons or libraries with similar or identical functionality. The following should, therefore, apply equally to all TWS API implementations (and multi-processing applications in general).

We had the following goals for our applications:
  • Any application can execute code from any of our libraries in as many parallel threads as it is useful.
  • Synchronization can be complex to program correctly, is expensive at runtime, and significantly reduces the effective parallelism of multi-threaded applications. Therefore, by design, code needs to eliminate the need for locking/semaphores or other synchronization by eliminating all (or at least the vast majority) of globally shared data.
  • The only acceptable synchronization is when a thread runs out of "things to do" and needs to wait (you guessed it - not sleep!) until more data is available.

So here is what we do ...

Ruthless separation of concerns, strict need-to-know-only, and only immutable objects

At the architecture level we break classes and modules into the smallest possible pieces so that all functions are related to just one topic/domain/subject. Instead of having an EWrapper object accessible to all functions in the application, we have a thin layer of code that wraps TWS API into a "controller" paradigm and groups the various functions into approximately 50 small, task-oriented interfaces such as: Account, Position, Contract, Order, TickByTick, MarketDepth, ... Also, each interface defines classes for the data returned by TWS API callbacks, so that each request returns a single immutable object that carries all parameters from the callback. In Java, these classes cannot be extended and instantiated objects cannot be modified once created. This way, objects can be shared side-effect free with many different modules in many parallel threads.

The controller completely hides the existence and details of TWS API (such as requestIds) so that the application code only deals with high level objects along the lines of the domain interfaces. The signature of request calls does not have a requestId parameter any longer but callers provide a handler object instead that receives all callbacks and errors for just that request. In other words and closer to your problem, even multiple MarketData subscriptions are completely separate from each other in that the application provides unique handler objects for each of them.

The Java implementation of TWS API ships with an ApiController class that shows how this can be implemented. And the good news is that you can develop your controller over time since it only needs to support the requests your applications actually use.
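
To make the handler idea concrete in Python (a hypothetical sketch, not the actual ApiController API; every name here is made up): the controller owns the requestIds and routes each callback to the one handler that subscribed.

from typing import Protocol

class MarketDepthHandler(Protocol):
    # receives every callback and error for exactly one subscription
    def on_update(self, position: int, operation: int, side: int,
                  price: float, size: float) -> None: ...
    def on_error(self, code: int, message: str) -> None: ...

class Controller:
    def __init__(self):
        self._next_id = 0
        self._handlers = {}   # reqId -> handler, an internal detail

    def subscribe_market_depth(self, contract, handler):
        self._next_id += 1
        self._handlers[self._next_id] = handler
        # ... here the controller would issue the actual
        # reqMktDepth(self._next_id, contract, ...) request

    # called from the EWrapper callback; routes to exactly one handler
    def _dispatch(self, reqId, position, operation, side, price, size):
        self._handlers[reqId].on_update(position, operation, side, price, size)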

No global state and data streams

The main reason why a multi-threaded application needs locks/semaphores is to prevent data corruption when multiple threads simultaneously modify some global state or data.

In your case, you have a shared global object that is not only modified by the updateMktDepthL2 callback [self.series[reqId].append( ...] and the thread that saves the data [app.series[reqId] = list()] but also by several independent market data streams for different instruments [e.g. indexed by reqId]. If you try to solve your issue with locks and semaphores, that object will become the application's bottleneck since it will be locked all the time effectively reducing parallelism to just about 1.

One way to eliminate the need for synchronization is to eliminate the global series object entirely:
  • Instead of storing the individual parameters from TWS API callbacks in lists with columns, define a class that carries all parameters for the callback and creates immutable objects that can be freely shared by all parts of your application.
  • Separate the data streams for multiple subscriptions/reqIds from each other so that they can operate in parallel and are not even aware of each other
  • Define a "consumer" interface that all classes implement that need to work with the data. In your case that could be
    • a MarketBook class that immediately handles the updates without having to store them in a list and that only remembers the last price and size update for each sell side and buy side slot (as @fordfrog suggested)
    • a FileLogger class that consumes the objects one at a time and saves them to file. This class could simply take advantage of "file output buffering" or keep a private list of a certain number of objects before it writes them to disk. That list is local to the logger and requires no synchronization. The logger class could also hide the fact that the logger actually runs in a separate thread. It would simply create a queue that the consumer interface feeds with data objects and that the logger thread reads and stores to file at its leisure and possibly with a lower priority than the real-time stream consumers.
    • There could be other users of the data that would simply implement the "consumer" interface as well.
    • Since each instance of the data is immutable, one and only one copy of the object exists at any point in time, regardless of how many consumers or threads receive a copy.
  • The controller keeps a list of consumers for each callback (such as MarketBook and FileLogger) or you could create a replicator class that receives objects from the API Controller and forwards them to one or more real consumers. We have that, for example, for TickByTick data where the replicator transparently forwards the objects provided by callbacks to several consumer streams:
    • A logger that serializes objects into Json representations and saves them to GZIP compressed files. We have a central generic logger that can handle all object types and that remembers the temporal sequence of all objects from the various streams. That way, we can replay a session in the exact order of events it took place in real time and analyze, after the fact, how events for one instrument foreshadowed changes for another instrument. Or we can measure the request/response times, since the logger keeps Instant time stamps with nanosecond resolution (though our clock more realistically has microsecond precision).
    • Several filter, aggregator, and processor streams that calculate and determine interesting details from the data stream. The results are published as streams again so that they can be consumed by multiple modules, possibly in multiple threads, and have to be calculated only once.
    • The trading logic that takes the raw TickByTickLast and TickByTickBidAsk data as well as filtered and aggregated information into consideration
    • The Position manager that keeps its own view of P&L and current market value of positions. This is independent from what TWS API provides since our requirements cannot be satisfied with TWS API data that resets P&L values some time in the middle of the night.
    • The order manager so that it is aware of the most recent bid, ask, and trade prices when it is time to place new orders
    • ...
    • This sounds like a lot of code but it actually is not. Proper structure and architecture reduce the actual amount of code, as you can also see within Ewald's ib_insync. Applications simply "wire up" the various data streams with predefined or custom modules that comply with the consumer and provider interfaces. Depending on the application's needs, data streams can execute within a single thread, use a fixed pool of threads, or have dedicated threads for some streams without having to worry about locking shared or global data objects.

This became a little longer than initially intended but hopefully gives you some food for thought for a more powerful and scalable way to solve the data corruption and deal with MarketDepth fire hoses.
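
As an illustration of the immutable-object and consumer ideas in Python (every name in this sketch, DepthTick, DepthConsumer, FileLogger, is made up for the example):

import gzip
import json
import queue
import threading
from typing import NamedTuple, Protocol

class DepthTick(NamedTuple):
    # immutable carrier for one updateMktDepthL2 callback
    time: float
    position: int
    marketMaker: str
    operation: int   # 0 = insert, 1 = update, 2 = delete
    side: int        # 0 = ask, 1 = bid
    price: float
    size: float

class DepthConsumer(Protocol):
    def on_tick(self, tick: DepthTick) -> None: ...

class FileLogger:
    # consumes ticks one at a time; a private queue feeds a writer thread,
    # so no shared state needs locking
    def __init__(self, path):
        self._queue = queue.Queue()
        self._file = gzip.open(path, 'at')
        threading.Thread(target=self._drain, daemon=True).start()

    def on_tick(self, tick):
        self._queue.put(tick)        # cheap and non-blocking for the callback thread

    def _drain(self):
        while True:
            tick = self._queue.get() # the writer works at its leisure
            self._file.write(json.dumps(tick._asdict()) + '\n')

# one consumer list per subscription: streams for different reqIds
# never touch each other's state
consumers_by_reqId = {}

def dispatch(reqId, tick):
    for consumer in consumers_by_reqId.get(reqId, []):
        consumer.on_tick(tick)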

Jürgen


On Sat, Oct 28, 2023 at 02:44 PM, John wrote:
@Buddy, ... However, you might be correct that it could come from unfortunate thread concurrency, i.e. a huge amount of data being added to the list at the exact time the DataFrame is created, so I might simply need to hard-copy the list and convert that copy to a DataFrame instead of the list itself. I'll try that next week and keep the group posted.


 

my design is pretty much the same, except much less feature-rich than what Jürgen and his buddies implemented.

if i should write it as simple as i can, the design is this:
  1. i have my own tws api client implementation:
    • it is heavily multithreaded (as it was cheap for me to implement), with respect to the fact that there is only one "pipe" between my api implementation and ib gw that messages go through.
    • for multithreading i use queues heavily, so that the items are always sorted as they came (at least within that topic queue, like ticks, order messages etc).
    • for incoming messages, i have queues for each topic (so that i keep the order of the messages as they came in) and process them one by one in a separate thread. this is very important, not to change the order of the messages within the topic.
    • for outgoing messages, as there is only one pipe out, i process the messages as i pass them in the message processor, with the only distinction, that i have a priority queue (which is always processed first) and a standard queue (which is processed once the priority queue is empty).
    • for each request type i have a single class which implements preparation and sending of the request, processing of responses and incoming events, and handling of corresponding errors (with some exceptions; e.g. order requests and messages are handled in a single class as they all originate from placing an order).
    • i have a single class (called TWSService) which implements various providers from my abstraction library and directs the requests to specific handlers. it also takes care of the connection handshake, monitoring the connection state (and notifying it outside via listeners), disconnecting, and automatic reconnection, if configured so.
    • the requests are passed to this implementation with uuid request id which is a unique id within my abstraction library, and it is mapped to tws api request id.
  2. i have an abstraction library that hides all the specifics of provider apis (like tws api, so that i could change data feed for example without changing anything in my code, just adding new api implementation) and provides all the features in the way that is more suitable for my consumers (apps):
    • i have connectors that manage related stuff for each specific topic (like accounts, contracts, data, orders, positions...).
    • i have provider interfaces that make standard communication interface for communication with api implementations (currently the only implementation i have is my tws api implementation). each connector gets one instance of a provider implementation to communicate with. in other words, the connector encapsulates the provider to build some more features on top of the provider, if needed.
    • some connectors provide some logic for stuff that is not api specific but rather my library specific, so if i would use different api later, the logic would/should work the same for the other api.
    • i heavily use listeners to pass received data to the consumers. this all happens asynchronously, as the request travels to my tws api implementation and on to ib gw, and from ib gw the messages travel through various queues back to the consumers (using standardized interfaces or objects from my abstraction library).
    • everything that would/could cause blocking (or does not need to be processed in-process) has its own thread (like data serialization, logging, ...) and queues so that the processing thread is as fast as it could be.
    • serializers serialize the data in batches. that is, they read the queue while there are items in it and only once the queue is empty, the transaction is committed (i use databases).
  3. and then i have various consumers (apps):
    • they instantiate a TWSService instance with specific parameters (host, port, client id) and pass the instance to a specific connector (which accepts a provider interface; TWSService implements those provider interfaces).
    • then all the communication is done through my abstraction library using the connectors (to send requests) and listeners (to receive messages).
    • the consumer also listens to the connection change interfaces directly on TWSService.
this is pretty much the general concept of my infrastructure.
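
as a rough sketch of the per-topic queue idea from point 1 (names are mine; a queue.PriorityQueue could similarly play the role of the outgoing priority pipe): the single reader thread only classifies and enqueues, and one worker per topic preserves arrival order within that topic.

import queue
import threading

def handle(topic, message):
    # placeholder: dispatch to the handler class registered for the topic
    print(topic, message)

# one FIFO queue per topic keeps messages ordered within that topic,
# while different topics are processed in parallel
topic_queues = {topic: queue.Queue() for topic in ('ticks', 'depth', 'orders')}

def topic_worker(topic):
    q = topic_queues[topic]
    while True:
        message = q.get()   # blocks (waits, not sleeps) until a message arrives
        handle(topic, message)
        q.task_done()

for topic in topic_queues:
    threading.Thread(target=topic_worker, args=(topic,), daemon=True).start()

# the single reader thread that owns the socket only classifies and enqueues
def on_incoming_message(topic, message):
    topic_queues[topic].put(message)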


 

IDK how much more of this you want to read about. But, I'll just try to bolster the crux of Jürgen's reply by referring you to the Wikipedia article on thread safety.

I'll also add that the "first approach" they describe (re-entrancy, local store, immutable objs) is often the approach people tend to overlook. This may happen because one doesn't need extra tools for the first approach, it's basically a "style". And, I suspect if you show someone a "thing" (mutex, semaphore, etc) they remember it more than if you describe a "way" (don't share data, put everything on the stack, etc).

Well, I'm not an educator so that's all conjecture.

The second approach, otoh, can't be avoided in some cases... often when dealing with physical resources. Anyway, you should choose the approach which suits your given circumstances. Only experience and judgement will help you know which. I'd err toward the first approach if there are any doubts. But again, the first approach doesn't often come naturally until it's practiced a bit... and sometimes it's not worth the extra mental gymnastics.

As usual, YMMV.


 

Thank you Jurgen, fordfrog, buddy for your replies, they are brilliant.

I realize now I wasted this forum's time on a data corruption issue due to multithreading, I hope these answers will be useful to others nevertheless.
This FX depth script is only for research purposes, the idea is just to collect a few day's worth of data and analyze it to see what alpha can be found if any.
So for now I used a bool flag that turns off writing for 0.1 seconds every hour during file saving; that's all, and it works nicely for this humble purpose of sample data collection.

If/once I establish that this is something worth trading in production, I will definitely take a more robust approach for data collection, and will apply this feedback.
I have a larger app already in place for live trading which does implement some of the suggested points, but it is incorrect on many of the levels raised here.
The below was especially useful to understand the right design approach, so thanks for that.

One way to eliminate the need for synchronization is to eliminate the global series object entirely:
  • Instead of storing the individual parameters from TWS API callbacks in lists with columns, define a class that carries all parameters for the callback and creates immutable objects that can be freely shared by all parts of your application.
  • Separate the data streams for multiple subscriptions/reqIds from each other so that they can operate in parallel and are not even aware of each other
  • Define a "consumer" interface that all classes implement that need to work with the data. In your case that could be
    • a MarketBook class that immediately handles the updates without having to store them in a list and that only remembers the last price and size update for each sell side and buy side slot (as @fordfrog suggested)
    • a FileLogger class that consumes the objects one at a time and saves them to file. This class could simply take advantage of "file output buffering" or keep a private list of a certain number of objects before it writes them to disk. That list is local to the logger and requires no synchronization. The logger class could also hide the fact that the logger actually runs in a separate thread. It would simply create a queue that the consumer interface feeds with data objects and that the logger thread reads and stores to file at its leisure and possibly with a lower priority than the real-time stream consumers.
    • There could be other users of the data that would simply implement the "consumer" interface as well.
    • Since each instance of the data is immutable, one and only one copy of the object exists at any point in time, regardless of how many consumers or threads receive a copy.


 

I would not characterize your post as wasting the forum's time, John. I think it caused a nice collection of valuable information that already helped you and will likely help others when they interface TWS API with a multi-threaded application or attempt to implement completely asynchronous and event driven applications (as they should).

This is not that hard, but also not intuitively obvious. And as @buddy pointed out and @fordfrog reiterated, it is mostly about style and structure (architecture, design) and disciplined coding.

@fordfrog your description does not sound any less feature-rich than what I am using. Pushing multi-threading to a lower level with topic queues is the right way to go and one of my roadmap items. And I fully agree with the use of provider, consumer, and connector interface abstractions to hide the TWS API details, for improved overall modularity, and to manage information flow even in applications that are not (yet?) multi-threaded.

Jürgen


On Tue, Oct 31, 2023 at 02:10 PM, John wrote:
Thank you Jurgen, fordfrog, buddy for your replies, they are brilliant.

I realize now I wasted this forum's time on a data corruption issue due to multithreading, I hope these answers will be useful to others nevertheless.
This FX depth script is only for research purposes, the idea is just to collect a few day's worth of data and analyze it to see what alpha can be found if any.
So for now I used a bool flag that turns off writing for 0.1 seconds every hour during file saving; that's all, and it works nicely for this humble purpose of sample data collection.

If/once I establish that this is something worth trading in production, I will definitely take a more robust approach for data collection, and will apply this feedback.
I have a larger app already in place for live trading which does have some of the suggested points, but it is incorrect on many levels raised here.
The below was especially useful to understand the right design approach, so thanks for that.

One way to eliminate the need for synchronization is to eliminate the global series object entirely:
  • Instead of storing the individual parameters from TWS API callbacks in lists with columns, define a class that carries all parameters for the callback and creates immutable objects that can be freely shared by all parts of your application.
  • Separate the data streams for multiple subscriptions/reqIds from each other so that they can operate in parallel and are not even aware of each other
  • Define a "consumer" interface that all classes implement that need to work with the data. In your case that could be
    • a MarketBook class that immediately handles the updates without having to store them in a list and that only remembers the last price and size update for each sell side and buy side slot (as @fordfrog suggested)
    • a FileLogger class that consumes the objects one at a time and saves them to file. This class could simply take advantage of "file output buffering" or keep a private list of a certain number of objects before it writes them to disk. That list is local to the logger and requires no synchronization. The logger class could also hide the fact that the logger actually runs in a separate thread. It would simply create a queue that the consumer interface feeds with data objects and that the logger thread reads and stores to file at its leisure and possibly with a lower priority than the real-time stream consumers.
    • There could be other users of the data that would simply implement the "consumer" interface as well.
    • Since each instance of the data is immutable, one and only one copy of the object exists at any point in time, regardless of how many consumers or threads receive a copy.


 

I don't know enough about Python's writing speed, but I am not sure you need multi-threading, because IB does buffer answers for you (to a limited extent, but it should be enough even with L2),
so you might make your life simpler by doing your experiment with the save (append) in the same thread, as IB already hides its own receiving thread from you.
Also, you ought to save 'operation' in your CSV; it is the only safe way to handle cases 1 and 2 at a later stage.
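
A sketch of that simpler single-threaded variant (file name and buffer size are illustrative; note 'operation' is kept in the CSV): append each tick directly in the callback and let file buffering amortize the cost of the small writes.

import csv
from time import time

# opened once; a large buffer amortizes the cost of many small writes
depth_file = open('usdjpy_depth.csv', 'w', newline='', buffering=1 << 20)
writer = csv.writer(depth_file)
writer.writerow(['time', 'operation', 'side', 'position', 'price', 'size'])

def updateMktDepthL2(self, reqId, position, marketMaker, operation,
                     side, price, size, isSmartDepth):
    # runs on the API's receiving thread; no second thread, no lock needed
    writer.writerow((time(), operation, side, position, price, size))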


 

@fordfrog your description does not sound any less feature-rich than what I am using. Pushing multi-threading to a lower level with topic queues is the right way to go and one of my roadmap items. And I fully agree with the use of provider, consumer, and connector interface abstractions to hide the TWS API details, for improved overall modularity, and to manage information flow even in applications that are not (yet?) multi-threaded.
@Jürgen what i had in mind is that your level of professionalism (which manifests through features like the extensive logging, the possibility of replay, data analysis, overall debuggability, and many more) is much higher, a level that i would like to achieve one day too. you inspire me :-)


 

today i was writing some tests for my own tws api client implementation and i encountered a new message that i had not noticed before (i haven't worked with market depth data for quite a long time though) and that is not mentioned anywhere in this thread. the message is (copied from my log):

INFO: Received message: ERROR_MESSAGE[2023-11-14 12:57:13.422]: ---n4-2-1-317-Market depth data has been RESET. Please empty deep book contents before applying any new entries.--

so if i get it right from the message text, immediately after this message is received, one should empty the market depth book and start fresh with an empty book. without respecting this message and cleaning the book, the book becomes broken.
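
for illustration, one way to honor that message in the error callback (a sketch only; self.books and its bids/asks are hypothetical, and the exact error() signature varies between api versions):

# 317 = "Market depth data has been RESET. Please empty deep book contents
#        before applying any new entries."
MKT_DEPTH_RESET = 317

def error(self, reqId, errorCode, errorString):
    if errorCode == MKT_DEPTH_RESET and reqId in self.books:
        # drop both sides; subsequent insert operations (operation == 0)
        # rebuild the book from scratch
        self.books[reqId].bids.clear()
        self.books[reqId].asks.clear()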


 

with the new message 317 (book reset), which is related to market depth, i found a "minor" flaw in my design. currently, i put all the info messages in a general queue, but that is not correct. the messages should be put into the topic queue based on the topic they relate to (like the market depth topic). it is even more important in this case, where one receives market depth items, then the reset message arrives, and then new market depth items arrive. if the book reset message is processed in a different thread, it is not guaranteed that the reset will occur in the correct order. hence, error messages should be processed in the topic thread they relate to, to preserve the order in which the messages should be processed.
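
to illustrate, building on the topic-queue sketch above (names are mine): routing the 317 reset through the same queue as the depth rows preserves the ordering guarantee.

import queue

topic_queues = {'depth': queue.Queue(), 'general': queue.Queue()}

def on_depth_row(reqId, row):
    # ordinary book updates for the market depth topic
    topic_queues['depth'].put(('row', reqId, row))

def on_error(reqId, errorCode, errorString):
    if errorCode == 317:
        # the reset travels through the SAME queue as the depth rows,
        # so the depth worker applies it in arrival order
        topic_queues['depth'].put(('reset', reqId, None))
    else:
        topic_queues['general'].put(('error', reqId, (errorCode, errorString)))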