Author Topic: Gauge interest/solicit advice in Open Source Project: MySync  (Read 943 times)

Pluto

  • New Member
  • *
  • Posts: 29
Gauge interest/solicit advice in Open Source Project: MySync
« on: March 24, 2021, 02:30:35 am »
MySync would be a super-simple command-line tool for synchronizing directories in a way that works superficially like git.  The logic comes from a former commercial product that I wrote in C#, which I would rewrite in Pascal.

The idea is that, since just about anything can be mounted as a drive these days, I can meet the needs of 90% of users with this functionality:

> MySync clone <origin directory>

This clones the origin into the current directory, and also creates a .mysync directory (similar to how git creates a .git directory) to hold details about the sync process.

then... when you want to sync changes, you just run:

> MySync Sync

In the default scenario (no flags), conflicts are reported but otherwise ignored.  They will reappear each time you perform a sync until you resolve them, either manually or by using the interactive (-i) flag:

> MySync Sync -i

Options for a conflict are push to [o]rigin, pull to local [r]epository, or [i]gnore.


Files overwritten while resolving a conflict are stored inside the .mysync directory, so that they can be restored later if need be.
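
One way to read "conflict" here, given the "last common" snapshot mentioned later in the thread, is a three-way comparison per file. The Python sketch below is an assumption about how such a decision could look, not MySync's actual logic (which is only described in the attached PDF). The names `Action` and `classify` are hypothetical, and each argument is a content fingerprint such as a checksum, or `None` when the file does not exist on that side.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "none"          # both sides already agree
    PUSH = "push"          # only local changed: copy local -> origin
    PULL = "pull"          # only origin changed: copy origin -> local
    CONFLICT = "conflict"  # both sides changed, in different ways


def classify(base: Optional[str], local: Optional[str], origin: Optional[str]) -> Action:
    """Decide what to do with one path.

    base   -- fingerprint recorded at the last successful sync (assumed)
    local  -- fingerprint in the local repository now
    origin -- fingerprint in the origin now
    None means "file is absent", so deletions are handled like any change.
    """
    if local == origin:
        return Action.NONE
    if local == base:
        return Action.PULL
    if origin == base:
        return Action.PUSH
    return Action.CONFLICT
```

With this reading, the default run would report every `CONFLICT` and skip it, while `-i` would ask for o, r, or i on each one.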

Like Git, there is a .mysyncignore file, which works like a .gitignore

Unlike Git, there is also a .mysyncretain, which works like a .mysyncignore, but instead of ignoring the file, it retains a version of the file (inside the .mysync directory) every time the file is replaced as part of the sync process.  In this way it mimics some of the functionality of a journaling filesystem.
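
A minimal Python sketch of the retain idea follows. It assumes retained copies live under `.mysync/retained/` and get a timestamp suffix; that layout, the suffix format, and the function name `retain_version` are illustrative guesses, not something the post specifies.

```python
import shutil
import time
from pathlib import Path


def retain_version(root: Path, relative: Path) -> Path:
    """Copy root/relative into .mysync/retained before it gets replaced.

    The directory layout and timestamp suffix are assumptions made for
    this sketch only.
    """
    source = root / relative
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target_dir = root / ".mysync" / "retained" / relative.parent
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / f"{relative.name}.{stamp}"
    counter = 1
    while target.exists():  # two replacements within the same second
        target = target_dir / f"{relative.name}.{stamp}.{counter}"
        counter += 1

    shutil.copy2(source, target)  # copy2 also preserves timestamps
    return target
```

The sync step would call this for every path matched by .mysyncretain just before overwriting it, so each replaced version stays recoverable.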

I know there are other sync utilities out there, but I think the logic of mine is unique... Please correct me if I'm wrong.

I attached a PDF file to give a bit more detail about the process itself, and how it figures out what to sync.  I've been using the logic for almost a decade now and it has always worked flawlessly.


Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 501
  • Professional amateur ;-P
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #1 on: March 24, 2021, 03:46:06 am »
Hi Pluto,

This seems quite interesting.
May I ask:
  • What's your diff strategy for binary files?
    Is it different for text files?
  • If you don't have a diff strategy, is it only size, name, date?
  • If it's so like git, why not implement add and remove?
    These would add and remove files from being tracked, like git.
    Yeah I know you have ignore, but so does git and it still has an add/remove for tracking.
  • Can it work over an ssh connection?
    Like rsync can?
  • How much is it different from rsync?

Cheers,
Gus
Lazarus 2.1.0(trunk) FPC 3.3.1(trunk) Ubuntu 20.10 64b Dark Theme
Lazarus 2.0.12(stable) FPC 3.2.0(stable) Ubuntu 20.10 64b Dark Theme
http://github.com/gcarreno

af0815

  • Hero Member
  • *****
  • Posts: 669
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #2 on: March 24, 2021, 08:12:22 am »
It looks like a simpler Unison. It should handle two-way sync, if I read the PDF correctly.

BTW:
rsync is one-way only and can run headless
Unison is two-way, can sync between two servers while minimizing the load on the connection, and can also run headless

The project would be interesting if it can handle this the way Unison does. Unison has also uncovered a lot of caveats in syncing between different systems.
regards
Andreas

Pluto

  • New Member
  • *
  • Posts: 29
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #3 on: March 24, 2021, 01:18:51 pm »
Hi Gus!  Thanks for the response.  You give me some good points to ponder.

1 & 2. Answer:  If the sizes differ, then the file is different.  If either of the dates differs, it performs a checksum: if the checksums match, it only updates the dates; otherwise the file is different.  Lastly, there is a -checksum flag that forces a checksum on all files.  If the filename changes, it's treated as a different file (two actions: a delete and a create)... I suppose it would be easy enough to look for matching files between the deletes and the creates, to save some time copying.
3. Answer: Well, it's only superficially like Git... I didn't propose a commit process, and based on the only real-world use case (mine), the plan was to keep it simple.  If I make it too much like git, then the question becomes, why not just use git?... there are plugins for dealing with binaries.
4. Answer: If I make it deal with just files, then you could do it using sshfs, which works great btw... But I should probably spend some time looking for a filesystem abstraction that handles ssh, sftp, etc.
5. Answer: As someone else pointed out, rsync is good for one-way synchronization (because it doesn't have the "last common" snapshot).
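
The comparison described in answers 1 & 2 can be sketched in Python roughly as below. This is a simplified illustration, not the MySync code: it looks only at the modification time (the post says "either of the dates"), it uses SHA-256 because the post does not name a checksum, and the names `file_digest`, `is_different`, and `force_checksum` are made up for the example.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks (checksum choice is assumed)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_different(local: Path, origin: Path, force_checksum: bool = False) -> bool:
    """Size first, then date, then checksum only when needed."""
    lstat, ostat = local.stat(), origin.stat()

    if lstat.st_size != ostat.st_size:
        return True   # different sizes: no need to read the files

    if not force_checksum and lstat.st_mtime_ns == ostat.st_mtime_ns:
        return False  # same size and date: assumed identical

    if file_digest(local) != file_digest(origin):
        return True   # same size, different content

    # Same content, only the dates differ: align the local dates.
    os.utime(local, ns=(ostat.st_atime_ns, ostat.st_mtime_ns))
    return False
```

The `force_checksum` parameter plays the role of the -checksum flag: it catches the rare case where content changed but size and date did not, at the cost of reading every file.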

Thanks for reading and cheers!

Kurt

Pluto

  • New Member
  • *
  • Posts: 29
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #4 on: March 24, 2021, 01:27:08 pm »
This is why I love this place!!  I hadn't heard of unison.  I just looked at the user guide, and wow is that a rocket ship.  Tons of options.  I have a very simple use case... I want to perform a backup every day, onto a network drive (origin), but I don't want it to take 7 hours.  If I'm on another machine, I want to be able to make changes, either directly to the origin or by changing another local repository and mindlessly synchronizing those changes.

Do you see a use case for MySync?

Kind regards,

Kurt

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 501
  • Professional amateur ;-P
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #5 on: March 25, 2021, 02:37:27 am »
Hey Pluto,

Hi Gus!  Thanks for the response.  You give me some good points to ponder.

Quite welcome. I'm here to help :)

1 & 2. Answer:  If the sizes differ, then the file is different.  If either of the dates differs, it performs a checksum: if the checksums match, it only updates the dates; otherwise the file is different.  Lastly, there is a -checksum flag that forces a checksum on all files.  If the filename changes, it's treated as a different file (two actions: a delete and a create)... I suppose it would be easy enough to look for matching files between the deletes and the creates, to save some time copying.

Quite a solid method, I like it!!

3. Answer: Well, it's only superficially like Git... I didn't propose a commit process, and based on the only real-world use case (mine), the plan was to keep it simple.  If I make it too much like git, then the question becomes, why not just use git?... there are plugins for dealing with binaries.

When I suggested this, my thinking was: maybe I want to skip some files for now that could be added in the future, and I don't want them to get lost in the ignore section.
It made sense at the time; I don't know if it's a use case with wider appeal.

4. Answer: If I make it deal with just files, then you could do it using sshfs, which works great btw... But I should probably spend some time looking for a filesystem abstraction that handles ssh, sftp, etc.

When I asked about this it was kinda the lazy person in me: if you have a way of pointing MySync directly at a remote, then you skip the step of having your OS connect a drive/path before running MySync.

5. Answer: As someone else pointed out, rsync is good for one-way synchronization (because it doesn't have the "last common" snapshot).

Thank you both @Pluto and @af0815 for the clarification on rsync!!

Cheers,
Gus

af0815

  • Hero Member
  • *****
  • Posts: 669
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #6 on: March 25, 2021, 06:51:37 am »
Do you see a use case for MySync?

Yes, there is more room for two-way syncing programs in open source. A fresh program with new ideas, working both headless and with a GUI, would be very welcome. And Unison is not user-friendly for beginners; I have spent a lot of time with this program. This is one of the reasons why it leads a secret life.

BTW: The sources of Unison are in OCaml (which is more or less a write protection, SCNR), but if you read the code, you will find a lot of caveats in syncing files across system boundaries. The goal is to have stable hashes across the systems.
regards
Andreas

BeniBela

  • Hero Member
  • *****
  • Posts: 788
    • homepage
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #7 on: April 12, 2021, 12:46:30 am »
I have wanted to do such a sync project for a long time. Unfortunately, I have no time to do it.

Unison synchronizes two drives/directories with each other

I would like to have a sync tool that can synchronize any number of drives and directories.

Practical example: I have two laptops, two external backup drives, and one PC in the office at work. The sync tool should make sure I have the necessary files on all of them.

Now you could call Unison on every pair of them, but then you have to call it manually and end up with a dozen Unison databases about the same files. My sync would have one database of files on each system, and then sync them automatically.

Now you could put all files in a cloud, but that needs to be online all the time. My sync would work fully offline and track which files have changed since the last sync. For example, I do not go to the office because of covid, so the PC there is turned off; it only needs to sync when it is turned back on. And the backup drive I only use once a month.
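
A rough Python sketch of "one database per system that tracks what changed since the last sync" might look like this. It is only an illustration of the idea: the JSON state file, the (size, mtime) fingerprint, and the names `scan`, `save_snapshot`, `load_snapshot`, and `changes_since` are all assumptions, not part of any existing tool.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

# relative path -> (size in bytes, modification time in ns)
Snapshot = Dict[str, Tuple[int, int]]


def scan(root: Path) -> Snapshot:
    """Record a cheap fingerprint for every file below root."""
    snapshot: Snapshot = {}
    for path in root.rglob("*"):
        if path.is_file():
            stat = path.stat()
            key = path.relative_to(root).as_posix()
            snapshot[key] = (stat.st_size, stat.st_mtime_ns)
    return snapshot


def save_snapshot(state_file: Path, snapshot: Snapshot) -> None:
    state_file.write_text(json.dumps(snapshot))


def load_snapshot(state_file: Path) -> Snapshot:
    # JSON turns tuples into lists, so convert back for comparison.
    raw = json.loads(state_file.read_text())
    return {key: (value[0], value[1]) for key, value in raw.items()}


def changes_since(old: Snapshot, new: Snapshot) -> Dict[str, List[str]]:
    """Compare the stored snapshot with a fresh scan, fully offline."""
    return {
        "created": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in new.keys() & old.keys() if new[k] != old[k]),
    }
```

A machine that was switched off for months would simply load its stored snapshot when it comes back, scan once, and hand the resulting change list to whatever does the actual transfer.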

Now you could put everything in version control, but that usually wants one version control for each project/folder, not one control that synchronizes entire drives on its own.

Also there should be a way to mark files for one-way syncing or only-for-this-system.
The work laptop I got from my employer has twice as much space as my own laptop. And my backup drive has more than twice as much space as both of them together. So the sync cannot actually sync everything. Big, rarely used files should only be kept on the backup drive. Some private files should probably stay on my private laptop only, and some work files only on the work systems.

Anyway, there does not seem to be a specific reason to write this in Pascal.


af0815

  • Hero Member
  • *****
  • Posts: 669
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #8 on: April 12, 2021, 09:21:36 am »
You must distinguish between sync and backup. IMHO they are not the same. E.g. if you make a mistake in a file, sync will propagate it to every location. With a backup you can go back one version and say: OK, it was bad, but now it's fixed, because I have an old version.

Sync: like unison
Backup: Like rsync-backup (SVN/GIT?)

« Last Edit: April 12, 2021, 09:23:07 am by af0815 »
regards
Andreas

BeniBela

  • Hero Member
  • *****
  • Posts: 788
    • homepage
Re: Gauge interest/solicit advice in Open Source Project: MySync
« Reply #9 on: April 16, 2021, 12:24:56 am »
The backup is just a less frequent sync.

It is literally in the name, rsync

And rsync is one of the things I want to get rid of, because I sometimes break my backups by using the wrong rsync options.

 
