Hi,
our plan with the bibliography collection of GNUnet is to implement something similar to your/freehaven's anonbib.
While running the build and cache update of it from current git HEAD on the anonbib.cfg I noticed a number of outdated and broken links.
I'm currently playing with 2 options: re-use anonbib as it is and change the style + some of its content (for us at GNUnet) or write something similar to it. From my perspective option 1 would be the best as we could work on fixing links together, keep the content up-to-date and at the same time keep the duplicate efforts and work down to a minimum.
What do you think?
At some time in the past I noticed that the anonbib did not have links to local copies of some of the materials. If that's still the case, I'd definitely suggest creating them at this opportunity. And though it is rare and means more curation work, some papers do receive content / errata updates.
On Fri, Nov 3, 2017 at 5:58 PM, ng0 ng0@infotropique.org wrote:
Hi,
our plan with the bibliography collection of GNUnet is to implement something similar to your/freehaven's anonbib.
While running the build and cache update of it from current git HEAD on the anonbib.cfg I noticed a number of outdated and broken links.
I'm currently playing with 2 options: re-use anonbib as it is and change the style + some of its content (for us at GNUnet) or write something similar to it. From my perspective option 1 would be the best as we could work on fixing links together, keep the content up-to-date and at the same time keep the duplicate efforts and work down to a minimum.
What do you think?
Hi! I'd love to have more people working on the anonbib content. The code itself is an old yucky kludge to which I feel no strong attachment, and the generated HTML is also in need of a revamp.
So, "patches welcome"!
On Fri, Nov 03, 2017 at 09:58:35PM +0000, ng0 wrote:
our plan with the bibliography collection of GNUnet is to implement something similar to your/freehaven's anonbib.
Great.
See also the censorbib, for another example.
While running the build and cache update of it from current git HEAD on the anonbib.cfg I noticed a number of outdated and broken links.
Yep. Many links have failed over the years. That was one of the big reasons to have the local cached version of each file.
I'm currently playing with 2 options: re-use anonbib as it is and change the style + some of its content (for us at GNUnet) or write something similar to it. From my perspective option 1 would be the best as we could work on fixing links together, keep the content up-to-date and at the same time keep the duplicate efforts and work down to a minimum.
Sounds plausible to me. I think we would be excited to take patches for broken links -- even if the new link becomes just a link to our cached version, which will hopefully live forever. :) https://www.freehaven.net/anonbib/cache/
But for the ones that have a broken link *and* don't have a cached version, it would be especially awesome for somebody to track those down.
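A minimal sketch of the triage described above, in Python. The cache URL is the one given in this thread; the `Entry` fields and the `is_alive` callback are hypothetical names for illustration and are not part of the anonbib code.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Cache location as mentioned in the thread.
CACHE_BASE = "https://www.freehaven.net/anonbib/cache/"


@dataclass
class Entry:
    key: str
    url: Optional[str]          # original link from the bib entry
    cached_file: Optional[str]  # file name under the cache, if any (assumed field)


def triage(entry: Entry, is_alive: Callable[[str], bool]) -> Tuple[str, Optional[str]]:
    """Decide which link to publish for an entry.

    `is_alive` is supplied by the caller (for example a link checker),
    so this function itself does no network access.
    """
    if entry.url and is_alive(entry.url):
        return ("ok", entry.url)
    if entry.cached_file:
        # Broken or missing link, but a cached copy exists: point at the cache.
        return ("use-cache", CACHE_BASE + entry.cached_file)
    # Broken link *and* no cached copy: someone has to track it down.
    return ("needs-hunting", None)
```

Entries that come back as "needs-hunting" are the ones worth listing for volunteers.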
It's not entirely clear what we ought to do with anonbib. At the beginning, there was no google scholar, so it really was the place to go to find out about papers in the anonymous communications area. And also, back then, there were only 10 or 15 papers and you could feasibly read all of them.
Now I think anonbib needs to be something other than "all of the papers about the topic". One way forward would be to cull it even more, so it becomes more of a recommended reading list.
--Roger
On 04.11.2017 19:54, Roger Dingledine wrote:
our plan with the bibliography collection of GNUnet is to implement something similar to your/freehaven's anonbib.
Great. See also the censorbib, for another example.
There's also a mixnet bibliography at https://bib.mixnetworks.org/ / https://github.com/applied-mixnetworks/mixbib . If you come across papers related to mixnets, please submit a patch! Also, we should add highlights like the anonbib has.
Roger Dingledine transcribed 2.5K bytes:
On Fri, Nov 03, 2017 at 09:58:35PM +0000, ng0 wrote:
our plan with the bibliography collection of GNUnet is to implement something similar to your/freehaven's anonbib.
Great.
See also the censorbib, for another example.
Thanks, I'll search for it.
While running the build and cache update of it from current git HEAD on the anonbib.cfg I noticed a number of outdated and broken links.
Yep. Many links have failed over the years. That was one of the big reasons to have the local cached version of each file.
I'm currently playing with 2 options: re-use anonbib as it is and change the style + some of its content (for us at GNUnet) or write something similar to it. From my perspective option 1 would be the best as we could work on fixing links together, keep the content up-to-date and at the same time keep the duplicate efforts and work down to a minimum.
Sounds plausible to me. I think we would be excited to take patches for broken links -- even if the new link becomes just a link to our cached version, which will hopefully live forever. :) https://www.freehaven.net/anonbib/cache/
But for the ones that have a broken link *and* don't have a cached version, it would be especially awesome for somebody to track those down.
It's not entirely clear what we ought to do with anonbib. At the beginning, there was no google scholar, so it really was the place to go to find out about papers in the anonymous communications area. And also, back then, there were only 10 or 15 papers and you could feasibly read all of them.
Now I think anonbib needs to be something other than "all of the papers about the topic". One way forward would be to cull it even more, so it becomes more of a recommended reading list.
--Roger
Christian Grothoff and I have a different understanding of how we would apply anonbib to our work, but essentially we would have 2 different "flavors". Anonbib has a specific focus (I assume, I didn't go through all the papers yet) and our paper selection would be more focused on another topic. Christian's idea is that we'd have two different topics hosted. We are discussing this here right now: https://gnunet.org/bugs/view.php?id=5121
To quote:
Working together on the anonbib code: great. Just to clarify: we would host _our_ bibliography and they'd continue to host theirs, right? Because the focus (secure P2P vs. anonymity) is somewhat different, so it does make sense to have two different sites with different papers.
Now the "problem" is that neither our bibliography nor yours seems to be completely "ours" or "yours": we mix in what we picked up on the way to where we are now. Our bibliography.git export right now counts 1045 files.
I agree with you, trimming them down could be necessary. For example we could concentrate on creating selected volumes of papers and the cross-links between them, and stay within a chosen topic.
I have no idea (at the moment) what has been collected on our side and how many of the files are outside of a common theme, I only did the export to git recently.
I'd rather not let people depend on Google's infrastructure for knowledge, but it shouldn't be our job to maintain a complete and growing library of knowledge either, so picking a topic and cutting down to that sounds reasonable to me.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Sat, Nov 4, 2017 at 2:54 PM, Roger Dingledine arma@mit.edu wrote:
On Fri, Nov 03, 2017 at 09:58:35PM +0000, ng0 wrote:
our plan with the bibliography collection of GNUnet is to
https://gnunet.org/bibliography
implement something similar to your/freehaven's anonbib.
https://www.freehaven.net/anonbib/ https://www.onion-router.net/Publications.html
See also the censorbib, for another example.
# I2Pbib https://geti2p.net/en/papers/
There are a few more I can't recall right now. If anyone knows of other community curated collections in the overlay routing mixnet messaging p2p privacy crypto comms distributed filesharing storage spaces... feel free to post links to them in this subthread.
It's not entirely clear what we ought to do with anonbib. At the beginning, there was no google scholar, so it really was the place to go to find out about papers in the anonymous communications area. And also, back then, there were only 10 or 15 papers and you could feasibly read all of them.
Yes, there are lots of papers all over the net, and in massive collections like arxiv, SSRN, etc... but few places collected and curated by a community of relevance here. One could envision a large community-curated bibliography database of papers and multimedia presentations, with a tagging and export system (perhaps JSON / HTML, pick and choose your fields) for those papers that each official project tags as references or relevant to their interests. Click 'Tor', get Tor's... click 'Briar', get Briar's... etc. That is in addition to the obvious function of global search and browsing everything in the database by whatever sorting / filters / rankings the reader chooses. Open submission by anyone (ie: bibinfos not yet submitted / tagged by a project) would put new entries into a 'potentially relevant to community' subpool, such that they might eventually be tagged by projects and readers as desired.
Saves a lot of duplicative work at the projects, is easily mirrored, imported into web pages, etc.
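To make the tagging / export idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the record layout, the function names, and the sample entries (placeholder titles and example.org URLs, not real papers).

```python
import json

# Hypothetical records: titles, URLs and tags are placeholders, not real entries.
ENTRIES = [
    {"key": "a", "title": "Example paper A", "year": 2004,
     "url": "https://example.org/a.pdf", "tags": ["Tor", "GNUnet"]},
    {"key": "b", "title": "Example paper B", "year": 2012,
     "url": "https://example.org/b.pdf", "tags": ["Briar"]},
    {"key": "c", "title": "Example paper C", "year": 2017,
     "url": "https://example.org/c.pdf", "tags": []},  # not yet tagged by any project
]


def select(entries, tag=None):
    """Entries carrying `tag`; tag=None means the whole database."""
    return [e for e in entries if tag is None or tag in e["tags"]]


def subpool(entries):
    """Openly submitted entries that no project has tagged yet."""
    return [e for e in entries if not e["tags"]]


def export_json(entries, fields=("key", "title", "year", "url")):
    """Serialize only the chosen fields, in the order given."""
    rows = [{f: e.get(f) for f in fields} for e in entries]
    return json.dumps(rows, indent=2)
```

A project page would then call something like `export_json(select(ENTRIES, "Tor"))` and render the result however it likes.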
Now I think anonbib needs to be something other than "all of the papers about the topic". One way forward would be to cull it even more, so it becomes more of a recommended reading list.
With mentioned problems of
- Google threat covered by community hosting and replication.
- Separate / overlapping project / topic focus covered by a flexible tagging and views system.
- Not easily being able to find and read what other projects in the space are referencing covered by now having a combined database itself.
- Maintaining effort of growing multiple bib systems covered by everyone lending some minor time coding to the main bib project db itself, freeing up time for each project to then focus on submit / tag and reading / using the materials as the more beneficial result.
And so on.
With mentioned problems of
- Broken links, not founds, redirects covered by a single monthly crawl thus being regular and benefitting all projects at once.
- Size, could apply common compression such as xz or even ZSTD to entire mirrorable local archive. Similar for video materials.
http://open-zfs.org/w/images/b/b3/03-OpenZFS_2017_-_ZStandard_in_ZFS.pdf https://www.youtube.com/watch?v=hWnWEitDPlM
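As a rough illustration of the crawl idea, the Python sketch below sorts links into the three problem classes named above (broken, not found, redirect). It uses only the standard library; the category names and the `crawl` / `classify` helpers are made up for this example, and some servers reject HEAD requests, so a real crawler would likely need a GET fallback.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (status_code, final_url); status 0 means the request failed outright."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects itself; geturl() reports where we ended up.
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        return err.code, url
    except OSError:  # URLError, timeouts, DNS failures, refused connections
        return 0, url


def classify(requested, status, final_url):
    """Map one fetch result onto a coarse category."""
    if status in (404, 410):
        return "not-found"
    if 200 <= status < 300:
        return "redirect" if final_url != requested else "ok"
    return "broken"


def crawl(urls, fetch=fetch):
    """Check every URL once and return {url: category}."""
    return {u: classify(u, *fetch(u)) for u in urls}
```

Run from a monthly cron job, the resulting report could be shared by all participating projects.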
I wonder if there is an option to start to use ipfs ( https://ipfs.io/ ) or something like it to permanently and resiliently store items for posterity?
On Nov 4, 2017, at 7:01 PM, grarpamp grarpamp@gmail.com wrote:
With mentioned problems of
- Broken links, not founds, redirects covered by a single monthly
crawl thus being regular and benefitting all projects at once.
- Size, could apply common compression such as xz or even ZSTD
to entire mirrorable local archive. Similar for video materials.
http://open-zfs.org/w/images/b/b3/03-OpenZFS_2017_-_ZStandard_in_ZFS.pdf https://www.youtube.com/watch?v=hWnWEitDPlM
On Sat, Nov 4, 2017 at 8:05 PM, Scfith Riseup scfith@riseup.net wrote:
I wonder if there is an option to start to use ipfs ( https://ipfs.io/ ) or something like it to permanently and resiliently store items for posterity?
Bib users would need a client to avoid abusing inproxy. Though a client would offload from the bib.
There doesn't seem to be much of a data loss issue now, papers with broken links are still refindable and fixable if searched for hard enough, no?
But it might be said there's organization, maintenance, and wider audience utility issues with current bibs.
However
- Once a better bib gets made, someone should consider pushing the dataset into IPFS, gnunet, storj, whatever. Object hash deduplicated systems among them are storage efficient, no matter how many people push the same thing.
- Since most video presentation data exists only on youtube (aka: google) at their whim, I assign high risk of loss to that community corpus. It's a mess. All projects should be publishing local copies of theirs for mirroring. Also, it's hard to autodedupe down from youtube since they embed uniques per download / view.
- Projects should self host, or at least dual home themselves, in their own overlays, for reference and other uses.
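For readers unfamiliar with object hash deduplication, here is a toy Python sketch of the principle: objects are stored under a hash of their content, so pushing the same bytes twice costs no extra space. This is only an illustration under simplifying assumptions. The `Store` class is hypothetical, and real systems such as IPFS or GNUnet use their own identifier formats and chunking rather than a bare SHA-256 of the whole file.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Identifier derived purely from the bytes (plain SHA-256 hex, an assumption)."""
    return hashlib.sha256(data).hexdigest()


class Store:
    """In-memory content-addressed store; a stand-in for a real backend."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        # Identical content maps to the same key, so it is kept only once.
        self.blobs.setdefault(cid, data)
        return cid

    def get(self, cid: str) -> bytes:
        return self.blobs[cid]
```

It also shows why per-download uniques break deduplication: a single changed byte yields a different identifier.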
grarpamp transcribed 1.9K bytes:
On Sat, Nov 4, 2017 at 8:05 PM, Scfith Riseup scfith@riseup.net wrote:
I wonder if there is an option to start to use ipfs ( https://ipfs.io/ ) or something like it to permanently and resiliently store items for posterity?
Bib users would need a client to avoid abusing inproxy. Though a client would offload from the bib.
There doesn't seem to be much of a data loss issue now, papers with broken links are still refindable and fixable if searched for hard enough, no?
But it might be said there's organization, maintenance, and wider audience utility issues with current bibs.
However
- Once a better bib gets made, someone should consider
pushing the dataset into IPFS, gnunet, storj, whatever. Object hash deduplicated systems among them are storage efficient, no matter how many people push the same thing.
- Since most video presentation data exists only on youtube
(aka: google) at their whim, I assign high risk of loss to that community corpus. It's a mess. All projects should be publishing local copies of theirs for mirroring. Also, it's hard to autodedupe down from youtube since they embed uniques per download / view.
- Projects should self host, or at least dual home themselves,
in their own overlays, for reference and other uses.
Good morning,
I like the proposed ideas so far, especially the idea of being able to filter by tags and keeping one code repository that could be reused at each other's location. We could try to use http://libgen.io/ and https://sci-hub.cc/ as a fallback search if there's a generic API for them (I haven't tried so far). I heard they are good, although they might be in a legal grey area depending on where you are located.
I think videos should be a separate issue. We selfhost them already as far as I know, but integrating them into git is no (good) solution. If you don't go for something like Mediagoblin, you could ask the higher level organization you are part of (for example GNU, in our case) if video/audio hosting capabilities exist. Asking CCC for hosting would be another choice, for their media they have a good amount of mirrors. In the long term this should be replaced, but for now this is good enough. However, this is derailing a bit from the original issue.
You listed some bibs that are similar to the ones already mentioned and proto-bibs (like ours at GNUnet). Should we track down more of them to ask the groups and people running them if they want to get involved? Or do you want to get started?
I'll need the feedback of Grothoff before I can say whether we as a group agree or not. My opinion is that it's good and reusable on our side without causing too much confusion about content and location.
On Mon, Nov 6, 2017 at 12:56 AM, ng0 ng0@infotropique.org wrote:
I think videos should be a separate issue, we selfhost them already as far as I know but integrating them into git is no (good) solution.
Don't think I would propose committing the actual videos / papers to git... too much bloat... just the bib / meta / hash info and links. Perhaps the links would point to files on the joint webserver. Mirrors could clone the git and rsync the files. Primary video links could be out to youtube. Secondary sets of links that require clients could go to IPFS or wherever for both papers and videos, even torrent magnet infohash, seeding bandwidth could be shared across projects as well.
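A small Python sketch of what "just the bib / meta / hash info and links" could look like in practice: git holds one small record per file, and a mirror checks its rsynced copy against the recorded hash. The record fields and function names are assumptions for illustration, not an agreed format.

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 16):
    """Hash a file in chunks so large videos need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def make_record(key, path, links):
    """Build the small metadata record that would be committed to git."""
    return {
        "key": key,
        "sha256": sha256_file(path),
        "size": os.path.getsize(path),
        "links": list(links),  # e.g. joint webserver first, other sources after
    }


def verify(record, path):
    """True if a mirrored copy matches what the record in git promises."""
    return (os.path.getsize(path) == record["size"]
            and sha256_file(path) == record["sha256"])
```

The records could be serialized as JSON or kept as extra fields in the bib entries; either way the repository stays small while the files travel by rsync or any other transport.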
If you don't go for something like Mediagoblin
exist. Asking CCC for hosting would be another choice, for their media they have a good amount of mirrors.
Whatever works.
Should we track down more of them to ask the groups and people running them if they want to get involved?
If they are in the crypto privacy messaging overlay etc etc etc spaces, it could be beneficial to at least send them a link to this thread. Since each can freely tag to their own desire / view, and it saves maintenance, it could be a hit.