Skip to content

support either nanosecond timestamp precision or composite primary keys #1597

@grenade

Description

@grenade

Describe This Problem

my use case is trade data analytics for signal or strategy indicators. a trade has:

  • timestamp: the time (in nanosecond precision) of the execution of the trade
  • base: the tag/symbol of the commodity or security that was bought or sold
  • quote: the tag/symbol of the currency, commodity or security that the price of base is given in
  • exchange: the tag/symbol of the exchange where the trade was executed
  • id: the (uint64) id used by the exchange to uniquely identify the trade. due to the volume of trades, most exchanges use a composite key of (id, base, quote), so there are always collisions if base, quote, exchange and id are not considered together
  • price: the price (in quote) for 1 unit of base
  • size: the quantity of base in the trade. a positive size indicates a buy. a negative size indicates a sell

i am struggling to document a table definition in horaedb that does not lose trades. since there are many trades which occur within the same millisecond. nanosecond precision appears to be unavailable.

i have attempted to work around the millisecond precision limitation by using a composite primary key (time, exchange, quote, id). each base gets it's own table which allows for multi-exchange series and combining quotes where their underlying values are equivalent (ie: usdt == usdc). however it appears that horaedb accepts the definition but ignores any value in the composite key that is not the timestamp. this results in all but the first trade in a given millisecond being discarded as a duplicate.

Proposal

i would appreciate ideas about how to utilise nanosecond timestamp precision or how to correctly define a composite key that will actually work.

Additional Context

here's the table definition i have tried:

CREATE TABLE IF NOT EXISTS {base} (
    time timestamp NOT NULL,
    id uint64,
    exchange string,
    quote string,
    price double,
    size double,
    TIMESTAMP KEY(time),
    PRIMARY KEY(time, id, exchange, quote)
) ENGINE=Analytic with (enable_ttl='false')

here's an example of source data where base is btc, quote is usd and exchange is coinbase:
https://api.exchange.coinbase.com/products/BTC-USD/trades?limit=10&after=1000

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions