The Great Derpibooru Database Transfer

Lotus

Site Administrator
The coding necessary to import entire archives is now here. This is a thread for all issues concerning this process. Concerns include:
Should the favorites and upvotes score from derpibooru be transferred?
What should happen to images we already have uploaded here?
And more
Background Pony #1DB5
Regarding "What should happen to images we already have uploaded here?":

I would propose to just merge the older duplicates into the new ones if possible. We need to get rid of one of them anyway and I see no immediate benefit to keeping them out of order.
I am working under the assumption that this approach is not significantly more work for the administrators than keeping the older ones. My understanding is that either way duplicates need to be detected, and then someone needs to review them and take appropriate action. But I don't actually know how difficult either one is to implement. If merging the older ones is more difficult, then a decision needs to be made: is keeping the upload order from derpi worth the extra effort?
ToastedTruffles

Early Adopter
I've mass-imported a few different tags and tag combinations from Derpibooru using the tool. When I checked how many images with the tags were on the site and compared this to the number of images I was trying to import, the numbers often didn't match up.

I think this means that many images may have been imported over to ponerpics without any tags attached to them. This guess is based on hearing that the ponerpics software has a way to double-check whether a newly-uploaded image is actually already on the server. I don't know whether this problem could be caused by the importer tool finding images on Derpibooru through its tag search that also happen to have been deleted from the DB servers.

Is Ponerpics as a site aware of possible problems like this, and are there solutions for it on the way? Something like mass importing the archives and overwriting the copies of the images that the users have mass-imported on their own?

Changing the subject a bit to answer one of Lotus' questions, I think that for the purpose of mass importing large archives, they should overwrite the user's uploads and merge any tags that may have been added here. I'm not entirely sure about the favourites and upvotes score transferring over. They're valuable information, but I have some minor concerns with types of images that were downvoted or upvoted based on politics. I mean that the score of some images is likely impacted more by how popular or disliked the subject matter is than by the actual quality of the drawing itself. With 50% being completely neutral on it, I'm 60% in favour of importing this metadata.
Lotus

Site Administrator
@Background Pony #031E
There is a duplication and merger system in place. The problem is that it often mistakes edits of an image for being the same image, so removing the discretionary aspect and making it automatic could result in images we don't want merged to be merged, while keeping it at moderator discretion would take too long. The current plan is to just not import and delete the current catalog, except for a few images tagged as not existing on derpibooru. There may be a way to preserve user favorites, but it is uncertain whether it will be possible to make it happen in a reasonable time frame.
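To illustrate why an automatic merge can confuse an edit with its original, here is a minimal sketch of one common style of near-duplicate check: comparing coarse brightness summaries of two images. This is not the site's actual code. The function names, the tiny 4x4 sample images, and the tolerance value are all assumptions made up for the example.

```python
from statistics import mean


def quadrant_intensities(pixels):
    """Return the mean brightness of the four quadrants (NW, NE, SW, SE).

    `pixels` is a list of equal-length rows of grayscale values (0-255).
    """
    height, width = len(pixels), len(pixels[0])
    mid_y, mid_x = height // 2, width // 2

    def region(rows, cols):
        return mean(pixels[y][x] for y in rows for x in cols)

    top, bottom = range(0, mid_y), range(mid_y, height)
    left, right = range(0, mid_x), range(mid_x, width)
    return (region(top, left), region(top, right),
            region(bottom, left), region(bottom, right))


def looks_like_duplicate(a, b, tolerance=5.0):
    """Hypothetical rule: every quadrant mean differs by at most `tolerance`."""
    return all(abs(x - y) <= tolerance
               for x, y in zip(quadrant_intensities(a), quadrant_intensities(b)))


# Made-up sample data, not real images.
original = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [10, 10, 120, 120],
    [10, 10, 120, 120],
]

# An "edit": a single pixel in the top-left quadrant is repainted.
edit = [row[:] for row in original]
edit[0][0] = 190

# An unrelated image (here simply the inverted original).
other = [[255 - value for value in row] for row in original]

print(looks_like_duplicate(original, edit))   # the edit is flagged as a duplicate
print(looks_like_duplicate(original, other))  # the unrelated image is not
```

A small local change barely moves a coarse summary like this, so a check of this kind cannot tell an edit from a re-upload. Tightening the tolerance trades false merges for missed duplicates, which is why a human review step is hard to remove.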


@ToastedTruffles
I don't think it's a tagging issue. I think that it could be due to one of two reasons. First, the live site has an older version of Philomena that does not allow the uploading of images beyond a certain resolution. This is fixed in the new server, but here, a small percentage of images cannot be uploaded. The other likely cause is that most of the mass importers do not actually strip the full set of tags you want them to, but only the first 700.
ToastedTruffles

Early Adopter
I'm not trying to get the mass importers to strip the tags (aside from some of the "X in the comments" tags); I'm trying to use them to transfer the images.

What I'm understanding so far from this is that there's going to be some sort of reset of the images currently on PonerPics, that this will happen along with some sort of server software update or server migration, and that it might be best to just back off for a few weeks until all this settles down. Mass importing via tools seems to be much less useful than I suspected it was.
Lotus

Site Administrator
@ToastedTruffles
I wouldn’t call mass importation tools unhelpful. Ponybooru and Twibooru rely on them in one form or another. We’ve just chosen to go a different route, and even that was uncertain for a long period of time because it was difficult to figure out the Elixir code and Elasticsearch used by the database.
Lotus

Site Administrator
The server transfer is complete.

Ponerpics has completed its import of the archive: over 2,150,000 images, including 68,000 marked as deleted from derpibooru. However, this import only includes those images that were in the archive, and has a few gaps, like all of the images uploaded to derpibooru after July 8, 2020. We plan on filling all of those gaps with more imports in the near future, so the importation isn’t done. We also want to transfer a few key images from our old ponerpics server, but haven’t been able to yet. Maybe we can add those in the next few days, but maybe they will need to be uploaded again.

All of the imported images are tagged as “imported from derpibooru” and those deleted are marked as “deleted from derpibooru.” Images have their original image number from derpibooru, which means that links in the description work. They also have their original creation date and are sortable by that number. Ponerpics displays the derpibooru favorites, upvotes, downvotes, and score in parentheses next to where these values normally appear. You can sort images by these values, and search with “derpi_upvotes” etc., and the same with combined scores. Tags have the same descriptions, images, and aliases as on derpibooru.

Not every piece of information was copied from the live ponerpics website because of limitations in testing ability and time. User avatars, moderation reports (including link and dnp requests), commissions, galleries, and private messages will have to be remade and reuploaded, as will all favorites, votes, and comments. Users who joined in the last few hours, after the information was transferred, will need to remake their accounts. Because of a discrepancy in how tags are numbered in the database, custom user filters will need to be remade.

You can still visit the current server at old.ponerpics.org until about August 30.

The import comes with some changes and updates, like a fix to file upload problems, and email password reset. We are missing a number of changes we would have liked to make, like new themes, fixing problems with the current default theme, mp4 support, and links to the derpibooru images below the source. We are hoping those can be added in the next few weeks.

Please tell us if there are any issues or if you have any requests for things you want us to try to transfer from the old server. We know there is an issue with svg uploads, and there are 6200 images that refuse to generate thumbnails. We hope to fix these issues quickly. The import set also includes many non-pony related and spam images that will need to be deleted. Please report these.

If you notice images that need to be deleted, or have requests for transfers, or concerns, please comment here.
Background Pony #2A53
Where can the philomena source code be found, including the custom modifications? According to the license of the philomena project, you are required to publish that including all custom changes, or you might lose the right to use the software until you ask every previous developer for permission again.

Might be easiest to just change the link in the footer, so people can easily find the source code repository.

PS: Just offering the software on tape by mail (which one other altbooru does) does not fulfill the requirements according to FSFE.
Lotus

Site Administrator
@Background Pony #2A53
We know. We intend to make the fork available soon, but with limited development resources and so many issues we've only been able to do one thing at a time. I don't believe our version is organized as a fork at this moment, so it's more complicated than changing a link.
Background Pony #E734
@Lotus
Are the future imports going to be more surgical or broad? I've got a small list of ids that don't appear to have made it, unless they're unavailable for other reasons.
Lotus

Site Administrator
@Background Pony #E734
Ideally, we'd search through the set of images on derpibooru older than July 6, 2020, find out what we don't have, import those, then import images uploaded from July 6 to the present, and then set up an automatic importer.
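The first two steps of that plan amount to a set difference followed by a split on the cutoff date. The sketch below is only an illustration under stated assumptions: `plan_import` is a hypothetical helper, the ids and dates are made up, and it assumes the source catalog is available as a mapping from image id to upload date.

```python
from datetime import date


def plan_import(source_images, local_ids, cutoff):
    """Split images missing locally into a backfill batch and a newer batch.

    source_images: dict of image id -> upload date on the source site
    local_ids:     set of image ids already present locally
    cutoff:        first upload date NOT covered by the archive
    """
    missing = {image_id: uploaded
               for image_id, uploaded in source_images.items()
               if image_id not in local_ids}
    # Gaps inside the period the archive was supposed to cover.
    backfill = sorted(i for i, uploaded in missing.items() if uploaded < cutoff)
    # Everything uploaded from the cutoff date onward.
    newer = sorted(i for i, uploaded in missing.items() if uploaded >= cutoff)
    return backfill, newer


# Made-up example data.
source = {
    1: date(2020, 7, 1),
    2: date(2020, 7, 3),
    3: date(2020, 7, 5),
    4: date(2020, 7, 6),
    5: date(2020, 7, 20),
}
local = {1, 3}

backfill, newer = plan_import(source, local, cutoff=date(2020, 7, 6))
print(backfill)  # ids to import first
print(newer)     # ids to import afterwards
```

Keeping the two batches separate matches the order described above: fill the holes in the archived period first, then catch up on everything uploaded since.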

Either way, it would be very helpful for you to post that list here whether directly or in a text document or paste bin, as it may help us locate problems.
Lotus

Site Administrator
@ToastedTruffles
Unfortunately we couldn’t import user avatars because of limited dev resources and the potential each new element had to create new errors and unforeseen problems.

You can access your account on the old server here:
old.ponerpics.org
ToastedTruffles

Early Adopter
I got a connection timeout error with the website host. I tried that URL earlier today, too, and got the same error.

For the time being, I roughly re-cropped my avatar, and made sure to save it to my computer this time.
Lotus

Site Administrator
@Barhandar
You can report them right here. I suspect what happened is that those images were uploaded on days where the Derpibooru scraper for the Rome archive did not run. As our current import is based on that archive, it has some gaps in it, most obviously it is missing all uploads after July 6, 2020. We are hoping to fill in the gaps with scrapes from Derpibooru.
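One way a reported list of ids could help locate such gaps is to collapse it into runs of consecutive ids. This is a minimal sketch, not the site's tooling: `missing_runs` is a hypothetical helper, the ids are made up, and reading a long run as a scraper outage rests on the assumption that ids were handed out in upload order.

```python
def missing_runs(reported_ids):
    """Collapse ids into inclusive (start, end) runs of consecutive ids."""
    runs = []
    # Sort and de-duplicate so repeated reports do not split a run.
    for image_id in sorted(set(reported_ids)):
        if runs and image_id == runs[-1][1] + 1:
            # Extends the current run by one.
            runs[-1] = (runs[-1][0], image_id)
        else:
            # Starts a new run.
            runs.append((image_id, image_id))
    return runs


# Made-up ids, as they might be pasted into a report.
reported = [105, 101, 102, 103, 250, 251, 103]
for start, end in missing_runs(reported):
    print(start, end, end - start + 1)  # start, end, number of ids in the run
```

Isolated single ids are more likely to be individual failures (for example an upload that was rejected), while long runs point at a stretch the archive never covered.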
BrokenInside™

Mouth Ready To Service
As for a different database question,

alias
loli -> lolicon
(or reverse)

alias or imply? (preferably imply, I suppose)
loli, lolicon, shota -> young
loli -> young
lolicon -> young
shota -> young

implications:
straight shota -> young, straight, male, female, age difference, shota
foalcon -> young
oppai loli -> loli

Not sure what else there is [aside from cub, but that's a mess because of the species/alternate meaning itself].
Lotus

Site Administrator
@BrokenInside™
Interesting. I’m not entirely sure why those images are missing, but I suspect those are more that the archive missed.


@BrokenInside™
We could add those, sure. It also looks to me like the implications were not added on tags, though those probably existed in the archive.
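For anyone unsure of the difference: an alias replaces one tag with another, while an implication keeps the tag and adds further tags alongside it. The sketch below shows that distinction under stated assumptions. It is not Philomena's implementation, `resolve_tags` is a hypothetical helper, and the tag names are neutral placeholders rather than real site data.

```python
def resolve_tags(tags, aliases, implications):
    """Apply aliases, then add implied tags transitively.

    aliases:      dict of tag -> the tag that replaces it (applied once, no chains)
    implications: dict of tag -> list of tags that get added alongside it
    """
    resolved = set()
    pending = [aliases.get(tag, tag) for tag in tags]
    while pending:
        tag = pending.pop()
        if tag in resolved:
            continue  # already handled; also guards against implication cycles
        resolved.add(tag)
        for implied in implications.get(tag, ()):
            pending.append(aliases.get(implied, implied))
    return resolved


# Placeholder data for illustration only.
aliases = {"pegasi": "pegasus"}
implications = {"pegasus": ["pony"], "pony": ["equine"]}

print(sorted(resolve_tags(["pegasi", "solo"], aliases, implications)))
```

Here the aliased tag disappears from the result, while each implying tag stays and pulls in its implied tags. That difference is the practical point behind preferring "imply" over "alias" when both tags should remain searchable.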
Background Pony #B4B9
@Lotus
Any news regarding publishing the source and linking to it, to keep any license issues from being an actual problem?
BrokenInside™

Mouth Ready To Service
@Lotus
Dislike that I'm several hours late, but the tag was supposed to be young, not younger. Forgot characters could be aged down to qualify for the latter tag in a safe manner.

Edit: At least I wasn't too late


Edit2: On that note, not sure what you'd want to do with the underage tag. Alias with young in some manner?
Also, it seems like the young tag is used pretty inconsistently.