Mirror the ebook library.

Started by Geremia, April 20, 2021, 10:46:09 AM

Geremia

Here are all the files and their checksums, updated 5AM UTC Tuesdays:
https://isidore.co/CalibreLibrary/CalibreLibrary.SHA256SUMS (currently 2.6MB)
mirroring suggestion 🎩-tip to Luke below

rsync, too:
rsync://isidore.co/CalibreLibrary
rsync://isidore.co/Zotero

Luke

Thank you for the torrents!

The CalibreLibrary web seed https://isidore.co/CalibreLibrary seems to be different from the torrent though. For example, metadata_db_prefs_backup.json is 24386 bytes long on the webseed, but the torrent file has it at 23943 bytes. (With 32 megabyte blocks this automatically invalidates multiple books after that file.)

(For the record, here are the SHA256SUMs I got from fetching the web seed.)

I would guess this makes the BitTorrent seeds unavailable (if you did a re-hash in your torrent client, it should also complain about, say, metadata_db_prefs_backup.json being different). Or maybe I'm wrong and I should just re-download everything via the BitTorrent protocol (and ignore the webseed), as it is seeded from a snapshot?

Thanks again for your library, it is fantastic to have this resource for the glory of God!

Geremia

Quote from: Luke on May 18, 2021, 10:56:06 AMCalibreLibrary web seed https://isidore.co/CalibreLibrary seems to be different from the torrent though
It is. The torrent was created a while ago, whereas CalibreLibrary is continually updated.

Luke

I see. To rephrase my question -- what's the best way to mirror CalibreLibrary? (And is it true that downloading CalibreLibrary.torrent would fail, as there is just one seed and the seed's backing store has changed in the meantime?)

For example, one could download the OPDS feed and files from there, but the OPDS does not include checksums. Maybe you could post checksums/file list of the CalibreLibrary directory? E.g. every week run something like: cd ~/CalibreLibrary ;  find . -type f -print0 | xargs -0 sha256sum > CalibreLibrary.SHA256SUMS ?

Geremia

Quote from: Luke on May 18, 2021, 02:15:54 PMAnd is it true that downloading CalibreLibrary.torrent would fail, as there is just one seed and the seed's backing store has changed in the meantime?
True. It's not the best way of doing it. The IPFS node isn't dynamically updated, either.

Quote from: Luke on May 18, 2021, 02:15:54 PMcd ~/CalibreLibrary ;  find . -type f -print0 | xargs -0 sha256sum > CalibreLibrary.SHA256SUMS
Thanks for the suggestion.
I like GNU Parallel; I ran:
parallel -0 sha256sum :::: <(find . -type f  -print0) | pv -Wbr > CalibreLibrary.SHA256SUMS
I put in a cronjob to run it weekly, too.

Luke

Thank you, this is great! I just did a sha256sum --check run and I have a complete copy :-) I will also create a Tor Hidden Service for my mirror and post a link here when it is done. (Plus, I learned that pv takes a -W argument, which is an added bonus  :) )
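For anyone who wants to repeat that check, here is a minimal sketch of verifying a local copy against a checksum list. It assumes GNU coreutils (sha256sum, realpath) and that the list uses paths relative to the library directory, as the find command quoted above produces. The verify_mirror helper and the throwaway demo files are made up for illustration; they are not part of the library or its tooling.

```shell
#!/usr/bin/env bash
set -euo pipefail

# verify_mirror DIR SUMS_FILE  (hypothetical helper)
# Checks every file listed in SUMS_FILE, whose paths are relative to DIR.
verify_mirror() {
    local dir=$1 sums
    sums=$(realpath "$2")
    # --quiet prints only the files that fail; the exit status is
    # non-zero if any file is missing or has a different checksum
    (cd "$dir" && sha256sum --check --quiet "$sums")
}

# Self-contained demonstration on a throwaway directory
demo=$(mktemp -d)
mkdir -p "$demo/lib/Author"
printf 'book one\n' > "$demo/lib/Author/book1.txt"
printf 'book two\n' > "$demo/lib/Author/book2.txt"
(cd "$demo/lib" && find . -type f -print0 | xargs -0 sha256sum) > "$demo/lib.SHA256SUMS"

verify_mirror "$demo/lib" "$demo/lib.SHA256SUMS" && echo "mirror complete"

# Simulate a file that changed upstream after the list was generated
printf 'edited\n' >> "$demo/lib/Author/book2.txt"
verify_mirror "$demo/lib" "$demo/lib.SHA256SUMS" || echo "mismatch detected"
```

Against the real library this would be something like verify_mirror CalibreLibrary CalibreLibrary.SHA256SUMS, after downloading the checksum file from the link in the first post.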

Geremia

#6
Quote from: Luke on May 18, 2021, 08:46:26 PMsha256sum --check run and I have a complete copy :-)
Complete copy of what? You can't possibly have downloaded the whole library already; the server's upload speed is only about 2 Mbps, and the library is 95GB, so it would take at least 4 days.
Or did you compare it to the torrent's checksums?
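The "at least 4 days" estimate can be checked with a quick calculation. This sketch uses only the two figures quoted above and assumes decimal units (1 GB = 1000 MB, 8 bits per byte) and a link that is fully saturated the whole time, so it is a lower bound.

```shell
# Rough lower bound on transfer time, using the figures quoted in the post
size_gb=95     # library size in gigabytes (decimal)
rate_mbps=2    # server upload speed in megabits per second

# GB -> megabits: 1000 MB per GB, 8 bits per byte
seconds=$(( size_gb * 1000 * 8 / rate_mbps ))
hours=$(( seconds / 3600 ))
echo "${seconds} s, about ${hours} hours, or $(( hours / 24 )) days and $(( hours % 24 )) hours"
```

That comes to 380000 seconds, a little under four and a half days.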

Luke

#7
Yes, a complete copy of the library! I mirrored books from isidore.co's OPDS feed back in 2020, so now I just had to re-download the latest additions.

$ du -hs CalibreLibrary
95G CalibreLibrary

Geremia

#8
Quote from: Luke on May 19, 2021, 08:25:19 AMYes, a complete copy of the library!
Deo gratias!
If you give me the 🧅 address of your mirror, I'll post it here.
Here's Zotero.SHA256SUMS, too.

Maybe you'd be interested in seeding to LibGen:
Quote from: palisadehealer on April 17, 2021, 09:31:13 AMCould you please upload the collection to Library Genesis as well? I'm sure it would do good to so many, especially the works of the Saints.
For your convenience, I've looked up the details of uploading and present them here:
LINK: http://librarian.libgen.lc/librarian/
USERNAME: genesis
PASSWORD: upload
I do hope you would entertain my humble request... I can only imagine the great profit souls would have, whether they seek out these Catholic works or merely chance upon them in LibGen.
I've been uploading to Z-Library the past few weeks (very slow).

Geremia

#9
Quote from: Luke on May 19, 2021, 08:25:19 AMjust had to re-download the latest additions
Can you suggest a user-friendly method of downloading the entire library using the checksum file? My friend, a Windows user who's not experienced with the command-line, would also like to mirror the library.

Perhaps rsync would be best:
rsync -ah rsync://isidore.co/CalibreLibrary CalibreLibrary
rsync -ah rsync://isidore.co/Zotero Zotero

Strider3000

Is the rsync server up and listening? I receive connection errors.

Geremia

Quote from: Strider3000 on August 18, 2021, 06:50:40 AMIs the rsync server up and listening? I receive connection errors.
Try it now.

Strider3000

Quote from: Geremia on August 18, 2021, 10:52:05 AM
Quote from: Strider3000 on August 18, 2021, 06:50:40 AMIs the rsync server up and listening? I receive connection errors.
Try it now.

Thanks Geremia!
I'm at about 24GB currently. The site keeps going down - I'm concerned my rsync may be causing issues. Do you need me to stay at a certain constant bandwidth or schedule?

Geremia

#13
Quote from: Strider3000 on August 22, 2021, 05:51:59 PMThe site keeps going down
I was trying to add another graphics card; that's why it was down.
Upload rate of my ISP is really slow, so it'll probably take >4 days or so to completely sync the whole library.
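The downtime here was unrelated to rsync, but if someone mirroring did want to cap their own transfer rate, rsync's --bwlimit option is one way to do it. Below is a minimal sketch; the mirror_limited wrapper, the RSYNC override variable, and the 200 KiB/s figure are illustrative assumptions, not something the server requires.

```shell
# mirror_limited KBPS SRC DEST  (hypothetical wrapper)
# Runs rsync with a bandwidth cap. Set RSYNC=echo to print the
# arguments instead of transferring anything.
mirror_limited() {
    local kbps=$1 src=$2 dest=$3
    # -a archive mode, -h human-readable sizes (same flags as the command above)
    # --partial keeps partially transferred files so an interrupted run can resume
    # --bwlimit caps the transfer rate, in KiB/s by default
    "${RSYNC:-rsync}" -ah --partial --bwlimit="$kbps" "$src" "$dest"
}

# Dry illustration: print the arguments rather than contacting the server
RSYNC=echo mirror_limited 200 rsync://isidore.co/CalibreLibrary CalibreLibrary
```

Dropping the RSYNC=echo prefix runs the real transfer. Since the server's upload is about 2 Mbps (roughly 250 KB/s), a cap somewhat below that leaves a little room for other visitors.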

Geremia

#14
A user is generously hosting a faster mirror of the e-book library. It is a gigabit connection.

Kephapaulos

What does it mean to mirror in this case? To copy exactly as something is online?

I don't understand how to mirror the library? How does one do that?

Geremia

Quote from: Kephapaulos on September 06, 2022, 06:52:54 PMWhat does it mean to mirror in this case? To copy exactly as something is online?

I don't understand how to mirror the library? How does one do that?
Run this rsync command:
Quote from: Geremia on May 19, 2021, 01:12:49 PMrsync -ah rsync://isidore.co/CalibreLibrary CalibreLibrary
https://isidore.colin.coffee is a faster mirror, but rsync and webseed support aren't currently set up on it.

Pali

#17
Hi Geremia, I contacted the admin [on Twitter] of Anna's Archive (https://en.wikipedia.org/wiki/Anna%27s_Archive), a repository/aggregator that backs up book collections to preserve knowledge, and I recommended that she also support your excellent Isidore Calibre collection. She's willing!

She offers to set up an SFTP server for you to upload to.

I hope you find this a good opportunity. It looks to be a great aid in bringing these excellent Catholic resources to a wider audience (even bigger than LibGen, whom they also back up). And of course, thank you so much for your work, Geremia; readers here can't thank you enough 🙏

Geremia

Quote from: Pali on July 08, 2023, 11:34:26 PMShe offers to set up an SFTP server for you to upload to.
I'd never heard of Anna's Archive. I contacted her.
Grazie.

Geremia

The various archives are now on the InterPlanetary File System (IPFS). See the "IPFS" links in the banner above.