theamk 5 years ago

So I was wondering, how do you do two-way sync with an inherently one-way tool like rsync? And it turns out you cannot do it, not reliably.

The way the script works, it runs a change monitor on both sides; if there is a change on the local side, it will do a local->remote sync; if there is a change on the remote side, it will do a remote->local sync.

This can go wrong in many, many ways. Here is the first example that came to my mind: you started a process on the remote machine which creates lots of small files -- maybe extracting an archive, or generating images. So the syncer keeps syncing those files in the remote->local direction. Meanwhile, you got bored watching the script and decided to edit some code. POOF! Any edits you make are continuously reverted.

Oh, and there is no error checking anywhere. Did your network have a hiccup? Tough, we will march on anyway. Let's hope it was not in lines 101 or 104 -- if these commands have transient failures, then your newly made changes would just get reverted.
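A minimal sketch of what error checking between sync steps could look like, assuming the steps are run as subprocesses. The `run_steps` helper and the stand-in commands are hypothetical and not taken from the script; a real version would put its rsync/ssh commands in their place.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, command) pairs in order and stop at the first failure.

    Returns (names_of_completed_steps, error_message_or_None).
    """
    completed = []
    for name, cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Do not march on: later steps could act on a stale view.
            return completed, f"{name} failed with exit code {result.returncode}"
        completed.append(name)
    return completed, None


# Stand-in commands so the sketch runs anywhere (assumption: the real
# steps would be the script's rsync invocations).
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(12)"]

completed, error = run_steps([("remote->local", bad), ("local->remote", ok)])
print(completed, error)
```

Here the second step never runs, because the first one reported a failure.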

If you care about your data, please do not use this. Use anything else -- syncthing, osync, unison were named in this thread, they are all good.

  • headgasket 5 years ago

    (author here) Hey, thanks for the scrutiny! I'm using the -u option in rsync, which prevents overwriting newer changes. So in the case you describe, it will signal a remote change (a lot of changes, but collapsed every 3 seconds), sync remote->local first, then local->remote, both with the -u option, so not updating if there's a newer version at the other end. No POOF, unless I'm mistaken. Cheers!

    Edit for clarity: the trick is that every sync step uses -u (not replacing updated files) and is done again in the other direction before restarting the watch.

    • theamk 5 years ago

      But you still have --delete there, and -u won't affect it. So if you create a local file, it will be deleted (it might stay if by chance it was created during second rsync, but this is not guaranteed).

      And the same deletion applies to network errors -- if ssh fails, then newly created files would get deleted.

      That said, I agree that the -u option does make it less likely that files get overwritten. This approach has some caveats, however -- archive extraction will mess up the modification times, and symlinks and directories are not handled properly. Still, regular IDE editing will work.

      Still not sure how it is better than unison though :)
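The -u versus --delete point above can be illustrated with a toy model. This is only a sketch: each side is a plain dict, and `one_way_sync` is a hypothetical function that imitates just the two flags under discussion, not rsync itself.

```python
# Toy model: each side is {path: (mtime, content)}.

def one_way_sync(src, dst, update=True, delete=True):
    """Copy src into dst, roughly like `rsync -a [-u] [--delete] src/ dst/`."""
    for path, (mtime, content) in src.items():
        if update and path in dst and dst[path][0] > mtime:
            continue  # -u: skip files that are newer on the receiving side
        dst[path] = (mtime, content)
    if delete:
        for path in list(dst):
            if path not in src:
                del dst[path]  # --delete: remove what the sender does not have


local = {"main.py": (200, "local edit")}
remote = {"main.py": (100, "old version"), "out/img1.png": (150, "generated")}

# A new local file appears just before the remote->local pass starts.
local["notes.txt"] = (210, "brand new")

one_way_sync(remote, local)  # remote -> local
one_way_sync(local, remote)  # local -> remote

print(sorted(local))        # ['main.py', 'out/img1.png']
print(local["main.py"][1])  # local edit
```

In this model the newer edit to main.py survives thanks to -u, while notes.txt disappears because --delete does not look at timestamps.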

      • headgasket 5 years ago

        Good thinking. You are right that there's a short window between the 2 rsyncs: a file newly created at, say, the remote end just after (or maybe during? would need to test) the remote-local rsync would get deleted by the subsequent local-remote rsync call.

        Since there's no lock or anything this vanishing new file just after creation is a possibility. I just wrote this yesterday, I'll be using it extensively, I'll see how annoying this is, and if there's something to do about it.

        As a side note, ssh failure has not been a problem (yet), since the script uses the same strategy when starting up. In fact I kill and restart this script a lot. I haven't played with archive extraction; this is mainly for source code editing.

        edit

        It would seem that a small modification to the --delete behaviour of rsync to only delete files at the other end that are older than say 30 seconds would handle this edge case. I'll see how annoying it is and if it warrants the time to investigate this.
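The grace-period idea above could be sketched as follows. This is a hypothetical filter, not an existing rsync option: `deletion_candidates` and its `grace_seconds` parameter are made-up names, and each side is modelled as a dict of path to modification time in seconds.

```python
def deletion_candidates(src, dst, now, grace_seconds=30):
    """Paths that exist only in dst and are older than the grace period."""
    return sorted(
        path
        for path, mtime in dst.items()
        if path not in src and now - mtime > grace_seconds
    )


local = {"main.py": 900}
remote = {"main.py": 900, "stale.log": 100, "fresh.txt": 990}

# fresh.txt is only 10 seconds old, so it is spared for now.
print(deletion_candidates(local, remote, now=1000))  # ['stale.log']
```

Since this relies on modification times, it inherits the caveat raised earlier in the thread: a freshly extracted archive member can carry an old mtime and would not be protected.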

  • api 5 years ago

    Are there any good open source live sync tools? Even one-way sync with rsync is slow, and two-way is, as you say, a kludge. Seems like there is not much open source stuff out there for this, and certainly nothing that can sync like these cloud drive apps.

    • theamk 5 years ago

      Well, I like "unison" -- it is a well designed, two way sync app.

      I personally run it in "-auto" mode -- every time I run the program, it shows me all actions it wants to do, any conflicts it detected, and asks me if I want to proceed.

      If you want to run live mode, you can just run:

          unison -ui text -repeat 5 . ssh://remote-host/dir
      
      it will check for changes every 5 seconds, sync over any changes, and skip all the conflicts. You'll have to re-run it without -repeat option to resolve the conflicts.

      • api 5 years ago

        I tried it but it never seems to work all that well for me. It also has terrible UX with confusing options.

        • theamk 5 years ago

          Unison's UX has options? I did not know.

          I recommend treating unison like you do rsync -- read the manpage, configure it with config files and shell wrappers.

          If you run it with "-ui text -auto", then it will print the list of changes, and ask: "do you want to proceed (y/n)", which is not that confusing.

          Treat it as mostly text-based syncer with UI as an extra bonus, and it will be much less confusing.

  • vegardx 5 years ago

    You basically can't get around this without distributed locking.

    • theamk 5 years ago

      Locking is not needed as long as you have conflict detection. It might be a bit annoying, depending on UI, but at least it won't result in data loss.

      (Note that local locking can be helpful to prevent simultaneous modifications by editor and sync tool; but unfortunately the Linux ecosystem is not designed with such locking in mind)
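A rough sketch of conflict detection without locks, assuming the tool remembers what each path looked like at the last successful sync (`base`). The `classify` function is hypothetical and far simpler than what a real tool such as unison does; it only shows why a remembered state removes the guesswork.

```python
def classify(base, local, remote):
    """Compare both replicas against the state recorded at the last sync."""
    actions = {}
    for path in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(path), local.get(path), remote.get(path)
        if l == r:
            continue  # both sides already agree
        if l == b:
            actions[path] = "remote->local"  # only the remote side changed
        elif r == b:
            actions[path] = "local->remote"  # only the local side changed
        else:
            actions[path] = "conflict"  # both changed: ask, never overwrite
    return actions


base = {"a.py": "v1", "b.py": "v1", "c.py": "v1"}
local = {"a.py": "v2", "b.py": "v1", "c.py": "v2", "new.txt": "x"}
remote = {"a.py": "v1", "c.py": "v3"}

for path, action in classify(base, local, remote).items():
    print(path, action)
```

Note that new.txt is propagated rather than deleted, because the remembered state shows it never existed on the other side, and the remote deletion of b.py is recognised as a deletion rather than a missing file to restore.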

rasengan 5 years ago

Cool project! Another alternative is to just use unison. Not only is it cross-platform for Windows, Mac and Linux, but on top of that it “just works”.

Amazing stuff. Brew, Cygwin and your favorite package managers have it.

  • NelsonMinar 5 years ago

    Seconding unison. It's kind of a forgotten tool but it's remarkably robust and good at what it does. (I have to think being written in OCaml has been part of why it doesn't have more of a community around it.)

  • beagle3 5 years ago

    When I checked a few years ago, unison worked but was (a) relatively slow, and (b) very picky about which versions are at which end.

    Almost any two rsyncs from the last 20 years speak to each other and do it rather quickly.

    • theamk 5 years ago

      For picky versions, there is a nice `addversionno = true` option, which will let you have multiple unison versions installed simultaneously. It will always pick the right one.

      I agree with slowness -- I feel the recent versions are pretty fast on Linux, but it can be faster. Also, its "scan, then transfer" approach means it can take lots of memory if there are a lot of files.

      Still, if you can use rsync in your application, use it. For automated backups, you cannot beat it. But unison has two unique advantages:

      - Proper two way sync -- doing it safely with rsync is almost impossible.

      - GUI/TUI which shows what changed and allows conflict resolution.

      • beagle3 5 years ago

        Indeed. But you still need multiple versions installed, which means you have to sidestep package managers (and generally common usage practices); which is really inconvenient if you have multiple systems with different os’s and distros.

  • nine_k 5 years ago

    Also, syncthing. It seems to be better maintained. It does not preserve special file attributes and symlinks, though.

  • headgasket 5 years ago

    I had never come across that one either, thanks for the link! It does look a little on the heavy side for my use case, but it might come in handy at some point! It looks like it has all the bells and whistles. Cheers.

senorsmile 5 years ago

I highly recommend osync(1). It has been around a while and has had quite a few users report bugs. The maintainer is active and very helpful. I've used it for a few years now with great success.

1: https://github.com/deajan/osync

  • headgasket 5 years ago

    (author here) Interesting, thanks for the link! I had never seen this one. It does look a bit more heavyweight; that might be a good thing for some use cases. I currently use this on many different containerized setups, so simpler and less setup/config is better for me at this point. Cheers!

m-p-3 5 years ago

Nice system, but I think I'll stick to Syncthing for the time being.

systemspeed 5 years ago

These sorts of solutions are always tantalizing, especially when developing on a Mac but deploying on Linux, but the killer feature for such an application would be IDE/ST3/Atom/VSC[/etc...] support. If this sync process could be orchestrated from such a development environment directly, it would avoid many design pitfalls, such as degenerating into rapid fs polling.

  • headgasket 5 years ago

    Hey! Thanks for your comment! I wrote this yesterday because I wanted the read speed of a local copy, hot-reload of the remote side app, and the ability to edit at both ends. I came across fswatch and I've been quite impressed; it does a pretty good job resource-wise. On Linux it uses inotify; I have not seen it degenerate into rapid fs polling (yet). Cheers!

    • systemspeed 5 years ago

      It's not a critique of your design. You solve the problem as you see it pretty spot on.

      It's more that I wish IDEs supported software like this. There's a plethora of such offerings (CyberDuck, Expandrive, etc.) that would benefit from reduced read/seek activity if the IDE could just orchestrate when to emit changes to what it thinks is the "disk". As you noted on GitHub, such software gets really laggy when working in directories that aren't trivially small.

orliesaurus 5 years ago

I use a few rsync scripts and build shortcuts, but gonna try yours honestly!

mamcx 5 years ago

Related, any option to sync files that work from iOS/Android (excluding dropbox, just to make things hard)?

  • headgasket 5 years ago

    well if you can get bash, ssh, fswatch, rsync and socat we could get something going... ;-)

    For iOS coda is pretty good. I've used Codeanywhere for android; 2-way sync would need to be integrated in the app itself on these platforms, I would think...

techntoke 5 years ago

Do you feel that it is production ready and that the sync won't become corrupt? Any plans for Android?

  • headgasket 5 years ago

    Hmm, a day old might be too early to say, but the sync is handled by rsync, which has been around for ages; corruption due to rsync would be surprising. (Although hash and adler32 checksum collisions are theoretically possible... :-)

    • techntoke 5 years ago

      It's not rsync I'm concerned about. It is fswatch and adding/deleting multiple files (possibly from multiple devices) and having them automatically synced.

      • headgasket 5 years ago

        More than 2 end points would be a bridge too far for this simple script.

        With regard to fswatch, from what I gather, it blocks on an ioctl call on all the files in your watched folder. This script fires rsync (always from the local end to prevent confusion) as soon as a change is detected on either end, starting with the end where the change has been detected. Pretty simple. I guess if you change files and delete files at the same time on both ends, some deleted files might get recreated. At this point this is the worst I can see, but I could be very wrong... :-)

etaioinshrdlu 5 years ago

I have a few qualms with this app:

1. For a Linux user, you can already build such a system yourself quite trivially by using Dropbox.

2. It doesn't actually replace Dropbox.

3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?

(Silly satire.)

  • the_pwner224 5 years ago

    This can sync any folder and (presumably) can have multiple instances running; Dropbox is limited to a single instance per user syncing ~/Dropbox. This is P2P and works without an internet connection.

    It seems to be built for a different use case; Nextcloud (server/client) and Syncthing (P2P) are already excellent Dropbox alternatives.