alx_ 6 years ago

I did a bit of digging in the repository. And I didn't like what I found.

First off, this project seems to be based on:

https://github.com/compose/transporter

But of course every mention of transporter or compose.io was removed from the repository, including the original BSD-3 license, which is clearly a violation of the BSD-3 license terms.

Guys, this is not how open source works. You shouldn't try to claim all the credit when your work is clearly a derivative of another work.

And this isn't even the worst thing about the project. While it claims to be open-source software, almost all of its functionality has been removed from the source code.

Thus, as open-source software, it's completely useless. If you build it from source it probably does nothing, because the syncing code is closed source.

Yeah, I know you can download a pre-built executable. But I won't ever be running some closed-source freeware/commercial software in production.

No, thank you very much. Especially if the aforementioned software was written by someone who completely lacks any notion of ethics.

  • newlyretired 6 years ago

    I second this. If a company is building their product off of other people's work, I believe they need to honor the spirit and intention of their predecessors.

    From the comment below, it sounds like maybe an Apple-like strategy to build off open source but charge for the resulting product? Either way, I would not feel comfortable recommending this product.

  • sidi 6 years ago

    Addressing your point directly, the project doesn't claim to be open-source (see https://github.com/appbaseio/abc#licensing), and the same is also mentioned in the blog post. The part of the project that is open-source is available at https://github.com/appbaseio/abc (but isn't relevant to the post here).

    The import functionality is based on the transporter project[1], and to set the record straight - we will be adding the acknowledgement for the same in the next binary release[2]. However, we are not redistributing the source as `abc import` isn't open-source. For anyone interested in why we aren't straight up using the transporter project as is, we have 1.) changed the sink functionality, 2.) added adaptors for SQL variants and CSV, and 3.) made the interface simpler. Going forward, we are more interested in the easiest way to sync <X> to Elasticsearch, which is different from transporter's goal of a generic ETL.

    I do appreciate you bringing this up. We're very much just getting this out there and want to do the right thing.

    [1] https://github.com/compose/transporter

    [2] https://github.com/appbaseio/abc/issues/77

    • alx_ 6 years ago

      Sure, there is some mention of "!oss". It's not 100% clear though. I sure missed it.

      You are redistributing transporter's source though, because it's available from the GitHub repository. Wtf are you even talking about? There is some code almost straight from transporter in your abc repository.

      Sure, you've made changes. Why should anyone be interested though if it's 100% closed-sourced? "free while in beta" my ass.

      • sidi 6 years ago

        We aren't redistributing transporter's source code. Where did you see that?

        • alx_ 6 years ago

          You are still redistributing source code that is derivative of transporter's code (like goja_builder.go and other stuff). Not to mention that anyone can just roll back commits in your repository and get to the original code of transporter, with original license, copyright and even authors list.

          I think that's a license violation, because you are not keeping the original copyright notice in the repository. If you think otherwise - whatever. I don't want to argue on technicalities.

          I think I'll just forward info to transporter's developers, so that they can handle the situation the way they want. You can argue with them or maybe they just don't care.

BrentOzar 6 years ago

I couldn't figure this out in the documentation - how are you keeping the data up to date? Is there some kind of scheduled refresh that pulls all the data from the database periodically, or how are you detecting which rows changed?

In particular, how are you implementing this in, say, the MSSQL importer?

> Adaptors may be able to track changes as they happen in source data. This "tail" capability allows a ABC to stay running and keep the sinks in sync.
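
For reference, one common way to detect changed rows without triggers is to poll against a "high-water mark" column. The sketch below is only an illustration of that general technique, not a description of how ABC or transporter does it. The `items` table, its `updated_at` column, and the dict standing in for the search index are all hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows modified after the high-water mark, oldest first."""
    cur = conn.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    return cur.fetchall()


def sync_once(conn, index, last_seen):
    """Copy changed rows into `index` (a dict standing in for a search index).

    Returns the new high-water mark to pass to the next poll.
    """
    for row_id, name, updated_at in fetch_changes(conn, last_seen):
        index[row_id] = {"name": name}
        last_seen = max(last_seen, updated_at)
    return last_seen


def demo():
    # Hypothetical source table; updated_at must be maintained by the writer.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)", [(1, "a", 1), (2, "b", 2)]
    )
    index = {}
    mark = sync_once(conn, index, 0)      # initial full import
    conn.execute("UPDATE items SET name = 'b2', updated_at = 3 WHERE id = 2")
    mark = sync_once(conn, index, mark)   # second poll picks up only row 2
    return index, mark
```

Polling like this needs a reliably maintained modification column on every synced table and never sees deleted rows, which is part of why log-based or replication-based approaches are often preferred for a "tail" mode.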

  • TYPE_FASTER 6 years ago

    I’m assuming they’re adding a trigger to the source table.

    • BrentOzar 6 years ago

      > I’m assuming they’re adding a trigger to the source table.

      Their video demo for SQL Server simply points the command line at a connection string, and it then says 2 item(s) indexed. There wasn't a way to pick specific tables.

      That means if the trigger method is true, they're adding a trigger by default to every single table in the database. That would be a remarkably bad idea.

    • Mister_Snuggles 6 years ago

      The PostgreSQL one uses the logical replication feature, perhaps the MSSQL piece does something similar.

sidi 6 years ago

Hi HN, we created ABC import as a convenient way to sync a data source to an Elasticsearch index. It does three things really well imho:

1. A small footprint process / docker container that can index or sync your data source with ES that is operationally simple vs relying on application layer logic,

2. Supports on-the-fly transformations with Javascript, as well as configuration of mappings (so if you want to set a specific analyzer on your Text fields, or set type mappings),

3. Works with a wide variety of sources - Postgres, MySQL, SQL Server, MongoDB, JSON, CSV, Elasticsearch and more coming soon.

  • styfle 6 years ago

    A couple of questions regarding SQL -> ES:

    1. Does it sync deltas or do you have to import the whole table each time?

    2. Does it listen for changes to a table or does it require a manual invocation?

    3. Can you explain more about the algorithm/implementation?

  • tedmiston 6 years ago

    Looks like a cool project.

    Does the real-time sync work for Postgres instances that are tens of GB and thousands of tables across schemas?

    • sidi 6 years ago

      Thanks! We haven't tested it at scale, but this is where we would ideally want it to be at v1. If you would like to take it for a spin, we would appreciate any feedback on scaling limits.