QSIM (Quant Signal Manager) is an internal execution tool for working with trading signals. It brings together:
so researchers and downstream systems can query, combine, and act on enriched signals at scale.
My role was to design and implement the end‑to‑end data pipelines: from Reddit ingestion and LLM‑based sentiment analysis to asset tagging and storage in ClickHouse, plus the core libraries that make this data easy to consume.
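The stages named above can be sketched as a chain of small functions. This is only an illustration under stated assumptions, not QSIM's actual code: every name in it (`RawPost`, `EnrichedSignal`, `score_sentiment`, `tag_assets`, `run_pipeline`, `KNOWN_ASSETS`) is hypothetical, the sentiment scorer is a keyword placeholder standing in for the LLM call, and the sink is any callable standing in for a ClickHouse insert.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical vocabularies; a real system would call an LLM instead.
BULLISH = frozenset({"bullish", "buy", "long"})
BEARISH = frozenset({"bearish", "sell", "short"})
# Hypothetical asset universe used for tagging.
KNOWN_ASSETS = frozenset({"BTC", "ETH", "AAPL"})


@dataclass(frozen=True)
class RawPost:
    post_id: str
    text: str


@dataclass(frozen=True)
class EnrichedSignal:
    post_id: str
    sentiment: float          # clamped to [-1.0, 1.0]
    assets: tuple[str, ...]   # sorted ticker symbols found in the post


def score_sentiment(text: str) -> float:
    """Placeholder for the LLM step: count bullish minus bearish words."""
    words = re.findall(r"[a-z]+", text.lower())
    raw = sum(w in BULLISH for w in words) - sum(w in BEARISH for w in words)
    return max(-1.0, min(1.0, float(raw)))


def tag_assets(text: str) -> tuple[str, ...]:
    """Tag a post with every known ticker that appears as an upper-case token."""
    found = {t for t in re.findall(r"\b[A-Z]{2,5}\b", text) if t in KNOWN_ASSETS}
    return tuple(sorted(found))


def run_pipeline(posts: Iterable[RawPost],
                 sink: Callable[[EnrichedSignal], None]) -> int:
    """Enrich each post and hand it to the sink; return how many were written."""
    count = 0
    for post in posts:
        sink(EnrichedSignal(post.post_id,
                            score_sentiment(post.text),
                            tag_assets(post.text)))
        count += 1
    return count
```

Passing the sink as a callable keeps the storage step swappable: a list's `append` works in a notebook, while a batch writer could be substituted in production.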
One of the main problems in earlier systems was the reliance on large proxy pools to collect Reddit data, which was expensive and brittle.
I designed and implemented a new Reddit parsing system that reduced proxy costs to absolute zero:
QSIM was split into several repositories:
qsim-core: shared domain logic and interfaces
qsim-executor: execution tooling and orchestration
qsim-data: data access, models, and ETL utilities
I was responsible for the core and data modules and for making them integrate cleanly across repos:
qsim-core is imported by both qsim-executor and qsim-data.
A key goal was to make it easy for downstream tools and notebooks to work with large volumes of signal data.
sqlalchemy for relational sources,
clickhouse-connect for ClickHouse.
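One way to let notebooks read from either backend through the same call is a small shared interface. The sketch below assumes such an interface would live in qsim-core with concrete sources in qsim-data; that placement, and the names `SignalSource`, `InMemorySource`, and `load_signals`, are hypothetical. To stay dependency-free it uses an in-memory stand-in; a real source would wrap a SQLAlchemy engine or a clickhouse-connect client behind the same `fetch` method.

```python
from typing import Any, Iterable, Mapping, Protocol, Sequence

Row = Mapping[str, Any]


class SignalSource(Protocol):
    """Hypothetical shared interface: anything with a matching fetch()."""

    def fetch(self, asset: str, limit: int) -> Sequence[Row]:
        ...


class InMemorySource:
    """Stand-in backend used here instead of a real database client."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def fetch(self, asset: str, limit: int) -> Sequence[Row]:
        # Filter by asset, then cap the result size.
        matches = [r for r in self._rows if r["asset"] == asset]
        return matches[:limit]


def load_signals(sources: Mapping[str, SignalSource],
                 backend: str,
                 asset: str,
                 limit: int = 100) -> list[Row]:
    """Look up the named backend and return its rows for one asset."""
    try:
        source = sources[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {sorted(sources)}"
        ) from None
    return list(source.fetch(asset, limit))
```

Because `SignalSource` is a structural `Protocol`, a backend only needs a compatible `fetch` method and does not have to inherit from anything, which keeps the consuming code free of direct database imports.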