Hey, I’m a software developer and I’m considering trying to build a site using ActivityPub, but I have a few concerns about it. My first concern is that if the platform is open source someone can host a malicious version of it, where certain requests may be ignored (such as deletion).
This leads into my next concern which is GDPR, because now I can’t be certain that a user’s data gets deleted upon their request, and I’m not certain whether I would be liable since my instance federates with the malicious instance (which may also not be hosted in the EU, which is itself problematic, and even if I’m not liable it’s still not great).
I considered whether it was viable to make the platform invite-based somehow, so that it doesn’t federate with everything by default, but that also sort of defeats the purpose of using ActivityPub.
The loss of control over content is also something that I don’t particularly like, since some people may use their own instance for harassment or something else gross, but I guess that wouldn’t be my problem since I just wrote the code and wouldn’t have anything to do with the hosting of such sites.
I’d appreciate any feedback, since I think the technology and the fediverse are very interesting. I would definitely like to try it out, but I’m not sure how to go about these challenges.
ActivityPub is a standard; Lemmy, KBin & Mastodon are open source applications built on that standard. It’s the same relationship as the one between Hypertext Transfer Protocol (HTTP) and Chrome, Safari, Firefox, Apache & IIS.
As a client/server architecture, Lemmy is no more or less vulnerable to malicious actors than a web browser or a web server. You’re at least as likely to have a rogue admin mishandle data as to have someone build Evil-Lemmy. While I consider myself a good netizen, if you delete this post right now I’m still going to have a copy for at least six months, because that’s my current backup retention for this instance.
I’m no GDPR expert but I can’t see how an instance owner who does comply with GDPR can be punished for instances they don’t control not deleting federated data. There are ongoing conversations throughout the Fediverse on this topic.
My first concern is that if the platform is open source someone can host a malicious version of it, where certain requests may be ignored (such as deletion).
Just so you know, this is not a fediverse-specific issue. For years, third-party websites have cropped up that scrape sites like Reddit and republish archived copies of deleted posts. I’m not sure your concern relates to the fediverse at all.
I can only comment on the content part: if someone posts content that’s against your instance policy you can either block their instance or the user afaik
People have been asking the admins directly and never got an answer (AFAIK). Maybe on Mastodon it’s been discussed more thoroughly.
But anyway, when you’re federating, you’re only “sharing” what the user wishes to be public anyway.
You’re not federating their personal information (like email address), which is your responsibility when it comes to stuff like data protection and locality. But what your users post online is not your responsibility, as long as you take reasonable precautions against illegal activity etc.
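To make that split concrete, here is a minimal Python sketch of keeping private account fields apart from the public actor document that other servers can fetch. The `Account` record, the `example.social` domain and the `to_actor` helper are hypothetical names invented for illustration; only the `@context` URL and the `Person`, `preferredUsername`, `inbox` and `outbox` properties come from the ActivityPub/ActivityStreams vocabulary. Real servers publish more fields than this (public keys, for instance).

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical local account record; field names are illustrative.
    username: str
    display_name: str
    email: str          # private: stays on the home server
    last_login_ip: str  # private: stays on the home server


def to_actor(account: Account, domain: str) -> dict:
    """Build the public actor document that other servers may fetch."""
    base = f"https://{domain}/users/{account.username}"
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Person",
        "id": base,
        "preferredUsername": account.username,
        "name": account.display_name,
        "inbox": f"{base}/inbox",
        "outbox": f"{base}/outbox",
    }


alice = Account("alice", "Alice", "alice@example.com", "203.0.113.7")
actor = to_actor(alice, "example.social")
```

The email address and IP never enter the dictionary that gets serialized, so a deletion request for those fields only has to be honoured on the home server.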
It’s not unlike email and such: if a user sends an email from your service, they can ask you to delete it from your server, but it’s not up to you to delete it from the recipients’ servers.
Considering the EU likes to promote interoperability of services, I’d say they are aware of such limitations. Just make sure your service is compliant, and let users know you have no power over other servers.
I’d assume it would be the malicious service that would be liable if anyone at all. Are you able to delete an email from another service after it’s sent under GDPR?
I can give an educated guess about GDPR:
Since the European Union has officially endorsed Mastodon (social.network.europa.eu), as long as your instance complies with GDPR you are not liable for actions taken by bad actors using ActivityPub to do bad-actory things.
I am not sure how that applies to data being sent to non-EU servers, as I lack knowledge about GDPR.
IANAL, but the GDPR only concerns itself with personal data (name, address, email, IP etc.) for deletion requests. These, however, are not necessarily shared with other ActivityPub servers, so deleting them from your own server should be sufficient.
This leads into my next concern which is GDPR, because now I can’t be certain that a user’s data gets deleted upon their request, and I’m not certain whether I would be liable since my instance federates with the malicious instance (which may also not be hosted in the EU, which is itself problematic, and even if I’m not liable it’s still not great).
I’m not a lawyer. I have done compliance work, though not for GDPR… so take this with several grains of salt.
I’d be fairly surprised if other instances caching your data had any impact on your GDPR status (unless you wrongfully made that data public in the first place).
If WordPress.com hosts an intentionally public blog post for a user, and archive.org scrapes it and saves a copy, and the user deletes it from WordPress (which correctly handles the deletion), would GDPR hold WordPress liable for a different organization retaining a copy on a different server? It would surprise me if it did; I can’t imagine how anyone could be in compliance while hosting public content under any circumstances if that were so. ActivityPub is not exactly the same as this, as it automates the process of copying data to many servers. But so does RSS, and that’s not new. If this were an issue, I think we’d have seen examples of it before now.
It’s more likely that each ActivityPub instance is a different service from GDPR’s perspective, and each instance needs the capability to delete content associated with a user upon request. But I believe deletes are already federated by default, so we’re only talking about malicious instances that deliberately ignore deletion requests. These would not be GDPR compliant, but I suspect that doesn’t reflect on your liability.
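For anyone wondering what a federated delete looks like on the wire, here is a rough Python sketch. ActivityPub defines a `Delete` activity, and the spec lets a receiving server replace its cached copy with a `Tombstone`. The function names, the in-memory `store` dict and the `example.social` URLs are my own assumptions for illustration, and the same-host check is only a stand-in for the signature verification a real server would do.

```python
from urllib.parse import urlparse

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_delete(actor_id: str, object_id: str) -> dict:
    """Build the Delete activity a home server sends to remote inboxes."""
    return {
        "@context": AS_CONTEXT,
        "type": "Delete",
        "actor": actor_id,
        "object": object_id,
    }


def handle_delete(store: dict, activity: dict) -> bool:
    """Apply a Delete to a local cache (`store` maps object id -> object).

    Assumes `object` is given as a plain id string.
    Returns True if the cached copy was replaced by a Tombstone.
    """
    if activity.get("type") != "Delete":
        return False
    object_id = activity["object"]
    if object_id not in store:
        return False
    # Assumption: comparing hosts stands in for real authorization.
    # A production server would also verify the request signature.
    if urlparse(activity["actor"]).netloc != urlparse(object_id).netloc:
        return False
    store[object_id] = {"id": object_id, "type": "Tombstone"}
    return True


note_id = "https://example.social/notes/1"
cache = {note_id: {"id": note_id, "type": "Note", "content": "hello"}}
delete = make_delete("https://example.social/users/alice", note_id)
removed = handle_delete(cache, delete)
```

A well-behaved instance runs something like `handle_delete` when the activity arrives. A malicious one can simply skip that step, and nothing in the protocol can force it to comply.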
… which may also not be hosted in the EU which is itself problematic…
Data locality is an interesting question, but I’m again inclined to suspect that YOU are not hosting data outside the EU. Other instances are, and the liability for doing so is theirs not yours.
If you were concerned about this, you could do whitelist federation, where you explicitly add instances in appropriate jurisdictions, rather than federating by default with a blacklist. The opportunity cost of doing this is, of course, cultural irrelevance. You’d be cutting yourself off from most of the physical and virtual world in order to achieve improved data locality.
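The difference between the two modes is small in code. Here is a minimal Python sketch, assuming a hypothetical `may_federate` check that a server would run before accepting or delivering an activity. The function name, the mode strings and the example domains are all made up for illustration and don't come from any particular fediverse project.

```python
from urllib.parse import urlparse


def may_federate(actor_url: str, mode: str, domains: set[str]) -> bool:
    """Decide whether to exchange activities with the actor's server.

    mode == "allowlist": only listed domains are accepted.
    mode == "blocklist": every domain is accepted except listed ones.
    """
    host = (urlparse(actor_url).hostname or "").lower()
    if not host:
        # No parseable host: refuse rather than guess.
        return False
    if mode == "allowlist":
        return host in domains
    if mode == "blocklist":
        return host not in domains
    raise ValueError(f"unknown mode: {mode!r}")
```

With `mode="allowlist"` and `domains={"eu.example"}`, only actors on `eu.example` get through. With `mode="blocklist"` and the same set, everyone except `eu.example` gets through.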
The loss of control over content is also something that I don’t particularly like…
This is real but rather the point of federation. If you really don’t like it, then federation is not for you. But consider multiple perspectives:
- As a user of reddit or another centralized publishing platform, you already didn’t have control over your data. The hoster did, as did the untold millions who scraped it maliciously and silently. This does not compare favorably to the fediverse.
- As an admin of a traditional forum like phpBB, you do give up control in the Fediverse. Though when you account for malicious scrapers, how much you give up is debatable.
- But as a user of that phpBB forum, the fediverse gives you MORE control. If the admin of that non-federated forum throws a tantrum and shuts it down, the community and posts are lost. As a user in the Fediverse, federation allows users on other instances to retain their account identity, recover posts from caches, and re-establish their community elsewhere against the wishes of the previous hoster.
Federation does require the hoster to give up power, but more than equally increases the power of users in return. Like GDPR, federation aims at increasing the data autonomy of users, but rather than focusing on privacy and data destruction to facilitate a user who wants to take their toys and go home, it focuses on how users can continue to access their data usefully in the face of an admin who wants to take their toys and go home. Although the means to achieve them are often in conflict, control over data destruction and control over data preservation are two sides of the same data-autonomy coin.