Kdb+ and Python: EmbedPy and PyQ

osrec 6 years ago

I've never quite understood why banks still opt for kdb when (a) hiring staff is ridiculously difficult/expensive and (b) there are so many cheaper alternatives available. They literally have a stranglehold on every eFX desk in London, and I cannot really figure out why!

geocar 6 years ago

Maybe you're wrong about (b)?
Sure, I could cobble something about as fast and functional out of postgres, C, awk, and a bunch of other things, but it'd certainly cost more just in developer time than kdb.
- osrec 6 years ago
  
  Your comment made me chuckle. In fact, it made me immediately think you were employed by Kx at some point; your profile confirmed this was the case!
  I think Kx has built a wonderful business, but as an ex MD at a bulge bracket bank, I would not sanction its use any more. It is expensive, and it's difficult to hire good people to work with it (not every guy with kdb on their CV is great). The product is good, but so are many open source offerings, which are now matching it both in terms of performance and flexibility. Plus hiring good people is significantly easier and cheaper.
  
  dogruck 6 years ago
  
  As an ex-MD you also know that banks used to not be too price sensitive, so they didn't mind paying a lot for some Kx experts.
  Of course, these days, with revenues being squeezed, that's changed.
  
  osrec 6 years ago
  
  Agree, price is a much bigger factor now.
  
  oper8or 6 years ago
  
  What are the open source offerings that you are referring to? The ones that provide 1) no-copy analytics and 2) I/O stack bypass, i.e. memory-mapping.
  
  kthielen 6 years ago
  
  The one I linked to in my previous post also produces code that's fast enough to use in the critical path in low-latency trading systems.
  As you may know, q's "scalar performance" is not great; it is similar to Scheme or Python due to boxing overhead (as you can see in the linked k.h file below).
  Also, the fact that q is untyped has a severe impact on its safe use in large and complex projects.
  (ETA: https://github.com/Morgan-Stanley/hobbes)
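To make the "boxing overhead" point concrete, here is a minimal sketch in CPython (not q; it says nothing about kdb+ internals). It assumes boxing means each scalar lives in its own heap object with a header, and compares that with a packed array of raw doubles. The function name `footprint` is hypothetical.

```python
import sys
from array import array


def footprint(n):
    """Return (boxed_bytes, packed_bytes) for n float64 values in CPython."""
    values = [float(i) for i in range(n)]

    # Boxed: the list holds pointers to separately allocated float objects,
    # so we count the list itself plus every float object it refers to.
    boxed_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

    # Unboxed: array('d') stores the raw doubles back to back.
    packed = array("d", values)
    packed_bytes = packed.itemsize * len(packed)

    return boxed_bytes, packed_bytes


if __name__ == "__main__":
    boxed, packed = footprint(1000)
    print(f"boxed: {boxed} bytes, packed: {packed} bytes")
```

The exact numbers depend on the interpreter build, but the boxed form pays for a per-value object header and a pointer on top of the payload.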
  
  oper8or 6 years ago
  
  Thank you for your answers. To summarize, we have 1) Hobbes 2) hacked LMDB 3) C++ memory-mapped store of arrays.
  Given that options #2 and #3 require some (non-trivial) work, they are not really options.
  We are left with #1, hobbes, which was uploaded to GitHub about 5 months ago and has a whopping team of 2 contributors, both employed by Morgan Stanley.
  This is more than nothing, but not much.
  I do not have experience with KDB, and looking at the language syntax, not a fan. Integration with Python (depending on implementation) may push KDB towards larger acceptance.
  So far I have mostly been relying on a variation of option #3.
  
  kthielen 6 years ago
  
  You must be able to come up with better criticism than that. Number of people who work on the project, time it's been on github, contributors' employers ... these are completely irrelevant to the question of whether this project is a viable alternative. There's not a single technical argument here! :)
  For what it's worth (not much), from a purely superficial standpoint, kdb itself started out as a one or two person project at Morgan Stanley! :D
  We've managed to get this thing right in the hot path (not just for analysis off on the side, though that use-case is important too) where a significant portion of global trading happens, in one of the biggest investment banks in the country, and we've had it working in production for four years doing this (before recently open sourcing), having had to make the technical case to many people who are very aware of kdb and what it can do (as far as kdb goes, Morgan Stanley is Mount Doom!).
  I mean, I take your point that it's not ubiquitous in the world yet, but in terms of the OP proposition that there are free and technically superior alternatives, it's proof positive.
  
  osrec 6 years ago
  
  There are a few. I've used a modified LMDB source in the past with success, employing similar tricks to kdb for performance (i.e. store daily data as contiguous arrays so that reads are quick etc). Either way, implementing a memory mapped store of arrays and operating on it is not too challenging a problem for any good C++ dev.
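A minimal sketch of the "memory-mapped store of arrays" idea, in Python rather than C++ and using only the standard library. The layout (one directory per day, one headerless float64 file per column) and the names `write_column` and `column_sum` are assumptions for illustration; they are not taken from LMDB or kdb+.

```python
import mmap
import os
import tempfile
from array import array


def write_column(root, day, name, values):
    """Store one day's values for one column as a contiguous float64 file."""
    day_dir = os.path.join(root, day)
    os.makedirs(day_dir, exist_ok=True)
    path = os.path.join(day_dir, name)
    with open(path, "wb") as f:
        array("d", values).tofile(f)  # native-endian doubles, no header
    return path


def column_sum(path):
    """Sum a column through a read-only memory map, without a read() copy."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
            memoryview(mm) as raw, raw.cast("d") as view:
        # 'view' reinterprets the mapped bytes as doubles in place.
        return sum(view)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = write_column(root, "2018.01.02", "price", [1.0, 2.0, 3.5])
        print(column_sum(path))
```

The operating system pages the file in on demand, so a query that touches one column of one day only pulls in that file. A production version would still need type metadata, appends, and error handling for empty files, which this sketch leaves out.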
  
  alfiedotwtf 6 years ago
  
  Everyone is hanging on for OP to deliver
  
  yalph 6 years ago
  
  So what are those alternatives?
  
  kthielen 6 years ago
  
  How about hobbes?
  
  yalph 6 years ago
  
  I will check, thanks, but I wanted to learn from him. I do not think there are many alternatives, by the way. Is this one a full alternative? Does it provide all the functionality of kdb?
  
  kthielen 6 years ago
  
  It’s better in a lot of ways. It has a type system, can produce much better code, can do cross-process compilation (for multi-process IPC). The FFI/binding process is much simpler, there are more options to record precisely-typed data from applications.
  And hobbes is used in major high-volume and low-latency systems at Morgan Stanley (where q originated, as you may know).
  
  yalph 6 years ago
  
  Hi thank you very much, I work in the industry and I will definitely take a look at it. You seem to be one of the developers behind this project. Is there any way I can contact you in the future?
  
  kthielen 6 years ago
  
  Yes I do work on the project at Morgan Stanley. Our group email is hobbes-dev@morganstanley.com and we're pretty responsive on that list (as well as on the github page).
  
  yalph 6 years ago
  
  How come kx is ok with this one? I heard they sue every moving object around.
  
  kthielen 6 years ago
  
  Kx does seem to bully people with threats of lawsuits, not sure how often they actually follow through.
  hobbes is not a k/q clone, it's much more like Haskell actually. The features that make hobbes especially compelling for its production use-cases, like its complex type system, are features that kdb has never had and probably never will have.
- kthielen 6 years ago
  
  https://github.com/Morgan-Stanley/hobbes
dilap 6 years ago

huh, so what do i need to learn to be one of these ridiculously expensive people working w/ kdb?
- ah- 6 years ago
  
  I'd start with https://code.kx.com/q4m3/
  
  osrec 6 years ago
  
  Also, you need to be lucky enough to somehow move into a team that uses kdb heavily (this usually happens via an internal move within an investment bank). Learning it using the tutorials is not the same as implementing it in a live environment, and the tutorials probably won't be enough to get you a job/contract. If you could say something like "I've worked on the FX desk at JP Morgan, where we used kdb daily", you'll be hired everywhere. Basically, you need a bit of luck in the first instance to get into a kdb focused team that can train you - after that, you're sorted!
- beagle3 6 years ago
  
  Historically, an APL background was a good indication that you grok the apl/k/j/q mindset, which is very different from the Algol (c/c#/Java/pascal) world and also from the Lisp world.

bkeroack 6 years ago

I like it when kdb is mentioned because then I can post this link: https://github.com/KxSystems/kdb/blob/master/c/c/k.h#L96

It's one of my all-time favorites. A window into a certain type of mind.

chc4 6 years ago

I'm surprised you didn't link an actual c file; everything related to k seems to trigger an exorbitantly high number of "what the fuck"s. https://github.com/kevinlawler/kona/blob/master/src/0.c for example.
http://archive.vector.org.uk/art10501320 is one of my favorite articles, though
- RodgerTheGreat 6 years ago
  
  That article was enough to inspire me to write a K interpreter, and eventually land me a job working with K. If I ever meet Stephen Taylor in person I imagine it'll be an interesting story to tell.
  
  5jt 6 years ago
  
  I try to get to most of the Kx Meetups in London.
  
  kthielen 6 years ago
  
  That's great that your interest produced that result! When I made a K interpreter, Kx threatened to sue me and everyone I had worked for.
  
  beagle3 6 years ago
  
  That’s likely because yours was fast enough to threaten their sales, whereas RodgerTheGreat’s is JavaScript and cannot.
  Nick Nickolov’s one also disappeared off GitHub, though Kevin Lawler’s Kona k3 implementation and Andrey Zholos’s jitted weird dialect are fast and still up; also Nils Holm’s Klong.
- e12e 6 years ago
  
  I was kind of happy to find some java code this time around - who says java needs to be verbose?
  https://github.com/KxSystems/kdb/blob/master/c/jdbc.java
smnrchrds 6 years ago

I imagine this looks more intuitive than normal code to a mathematician: They are no strangers to complex notations and they prefer the brevity they offer.
- yalph 6 years ago
  
  Sort of a mathematician here, it looks like a disease.
zie 6 years ago

Oh my. that's just.. wow. Obviously done before code-review became a standard!
- hakanderyal 6 years ago
  
  This thread might give some insights about the coding style: https://news.ycombinator.com/item?id=13565743
- pjmlp 6 years ago
  
  Code review is a cool thing in SV, at trendy startups, and at the big industry-related companies. Sadly, at most companies whose main business is completely unrelated to IT, it gets a spot just behind writing documentation, just like unit tests.

mclovinit345 6 years ago

The idea that I could seamlessly move data between Python and kdb+ is making me salivate. Please, make this work well and lower the cost barrier to using kdb+, and I think you'd see this become the de facto "stack" for many database setups.

ah- 6 years ago

All this needs now is a free kdb+ (at least a 64-bit version with commercial use allowed, ideally OSS).
I'd love to use this; many things are so much more straightforward in q than they are in pandas. But if it means no one can run my software unless they pay for kdb+, it's a non-starter.
I wonder if this would work with Kona.