anti-spam measures #51

Open
opened 2 years ago by x1ddos · 12 comments
x1ddos commented 2 years ago
Owner

last night was a bit hot.

  • lots of spam/porn images, so i temporarily disabled the noxy proxy by making it always return 403 forbidden, which resulted in broken image previews
  • i then added a display:none to the css in https://git.qcode.ch/nostr/nostrweb/commit/9b11459f to temporarily hide the broken images, and pushed it to nostr.ch
  • checked back earlier this morning: the bots realized images aren't showing, so they simply started spamming with text notes

the main problem here is with the public feed: new users who follow no one, or those who want to explore outside of their regular contact list.

now, how to solve this long term? here's one idea:

  1. by default, populate the "all events" public feed only with pow'ed events (NIP-13, https://github.com/nostr-protocol/nips/blob/master/13.md) or those from expensive relays (https://github.com/fiatjaf/relayer/tree/master/expensive)

  2. let users tweak this manually, i.e. opt-in. essentially, reduce or disable the "pow requirement", and then of course relay management too, so users can add non-expensive relays at their own risk.

@offbyn wdyt? feels like this is the top priority. last night the feed was basically unusable due to spam.
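
to make idea (1) a bit more concrete, here is a minimal sketch of the client-side check. NIP-13 defines difficulty as the number of leading zero bits in the event id. the `NostrEvent` shape and the `filterByPow` helper are made up for illustration and are not existing nostrweb code; the threshold would be whatever the user configures.

```typescript
// Minimal event shape for this sketch; only `id` matters here.
interface NostrEvent {
  id: string; // event id as lowercase hex
  content: string;
}

// NIP-13 difficulty: number of leading zero bits in the hex event id.
function countLeadingZeroBits(hexId: string): number {
  let bits = 0;
  for (let i = 0; i < hexId.length; i++) {
    const nibble = parseInt(hexId.charAt(i), 16);
    if (isNaN(nibble)) break; // not a hex digit: stop counting
    if (nibble === 0) {
      bits += 4; // a zero nibble contributes four zero bits
      continue;
    }
    // First non-zero nibble: add its own leading zero bits, then stop.
    if (nibble < 2) bits += 3;
    else if (nibble < 4) bits += 2;
    else if (nibble < 8) bits += 1;
    break;
  }
  return bits;
}

// Hypothetical helper: keep only events meeting the configured threshold.
function filterByPow(events: NostrEvent[], minDifficulty: number): NostrEvent[] {
  return events.filter((e) => countLeadingZeroBits(e.id) >= minDifficulty);
}
```

setting `minDifficulty` to 0 would disable the requirement, which is what the opt-in in (2) amounts to.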

x1ddos added the
feature
label 2 years ago
x1ddos added this to the mvp project 2 years ago
offbyn commented 2 years ago
Owner

Unfortunately I missed it this morning.

  • was it coming from a certain relay?
  • was the content similar, or even the same content repeated?
  • did they use different pubkeys?

long term ideas:

I like a bit of pow and think "expensive" relays should be explored but I am not 100% sure how they work.

also 100% for 2. this is a nice idea. could it be configured for both sending and displaying?

some brainstorming:

nostrweb could postpone showing new events based on rules, for example: when a pubkey posts too much or if the same content is spammed over and over. this would only help clients that are already loaded but would still give a bad first impression.

nice thing about injecting the new spammy events later is that at some point the timeline is 100% "correct" again, with all the spam, but by then that is maybe an hour in the past and the present should be usable.

alternative to showing later could be a small text: user seems to be spamming [show anyway]
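
a rough sketch of the "pubkey posts too much" rule, assuming a sliding window per pubkey. the class name, the 60 second window and the limit of 5 events are all made-up defaults that a user would be able to change. whether a flagged event gets postponed or collapsed behind a "[show anyway]" line is left to the caller.

```typescript
// Sliding-window counter per pubkey. Window and limit are illustrative defaults.
class PostRateTracker {
  // Timestamps of recent events, keyed by pubkey.
  private seen: { [pubkey: string]: number[] } = Object.create(null);
  private windowSeconds: number;
  private maxEvents: number;

  constructor(windowSeconds: number = 60, maxEvents: number = 5) {
    this.windowSeconds = windowSeconds;
    this.maxEvents = maxEvents;
  }

  // Records one event and reports whether the pubkey is now over the limit.
  // `createdAt` is the event's unix timestamp in seconds.
  isOverLimit(pubkey: string, createdAt: number): boolean {
    const cutoff = createdAt - this.windowSeconds;
    // Drop timestamps that fell out of the window, then add the new one.
    const recent = (this.seen[pubkey] || []).filter((t) => t > cutoff);
    recent.push(createdAt);
    this.seen[pubkey] = recent;
    return recent.length > this.maxEvents;
  }
}
```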

nostrweb could maybe detect if a relay becomes too spammy and also throttle it, or offer that as an option

is there a nip for muting or other kind of reporting? could have a rule that if a pubkey/content is muted too often the client decides to mute as well.

of course this would be some kind of cat & mouse game. relays can do more, e.g. ratelimit by ip address?

offbyn commented 2 years ago
Owner

more exotic ideas using 3rd party modules

for images: the client could have a safe-for-work option and first check each image with https://www.npmjs.com/package/nsfwjs (selfhosted model to detect % of porn in an image) before displaying.

nostrweb could have an option to personalize and train what should be hidden: http://naturalnode.github.io/natural/bayesian_classifier.html

could check for profanities
https://github.com/retextjs/retext-profanities (same ecosystem has a nice plugin for typing :emojis https://github.com/retextjs/retext-emoji)

offbyn commented 2 years ago
Owner

instead of having everyone's client running https://www.npmjs.com/package/nsfwjs we could run a bot that likes everything that it thinks is not porn, and offer an option to follow this bot and only display what it likes.

offbyn commented 2 years ago
Owner

could have a local word-ban-list that is configurable.

maybe this wordlist could be published regularly so others (including spammers, though) could see popular wordlists and merge them into theirs.
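
as a sketch only: a local ban list could be as simple as the two functions below. both names are made up, matching is on whole lowercase words, and the tokenizer only understands ascii letters and digits, so it would need work for other scripts.

```typescript
// Returns the banned words found in `content`, matched as whole words.
function findBannedWords(content: string, banList: string[]): string[] {
  // Split on anything that is not an ascii letter or digit.
  const words = content.toLowerCase().split(/[^a-z0-9]+/);
  const hits: string[] = [];
  banList.forEach((entry) => {
    const banned = entry.toLowerCase();
    if (words.indexOf(banned) !== -1 && hits.indexOf(banned) === -1) {
      hits.push(banned);
    }
  });
  return hits;
}

// Merges a published list into the local one, lowercased and without duplicates.
function mergeBanLists(local: string[], published: string[]): string[] {
  const merged: string[] = [];
  local.concat(published).forEach((entry) => {
    const word = entry.toLowerCase();
    if (merged.indexOf(word) === -1) merged.push(word);
  });
  return merged;
}
```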

offbyn commented 2 years ago
Owner

could only display images from followed users or liked events (ofc that requires following) but doesn't help with explore feed.

x1ddos commented 2 years ago
Poster
Owner

> was it coming from a certain relay?

not really, from all relays i have seen; certainly all that are in the src/main.js pool

> was the content similar, or even the same content repeated?

yes, similar. actually, identical, but it seemed like they changed it every now and then

> did they use different pubkeys?

yes, like with content: the same pubkey continuously posts events, then they change the pubkey. actually, i've seen multiple pubkeys posting at almost the same time.


good brainstorming! to underline, there are two basic scenarios:

  1. a user follows no one, or a user is searching for new pubkeys to follow in a dumpster fire "all events" feed like we have now on nostrweb.
  2. a home feed, containing events only from certain pubkeys a user already follows and nothing else.

the problem is clearly with the first scenario because in the second, it is easy to "mute": just unfollow that pubkey if it's too spammy. yes, you might want to only "partially" mute a pubkey, aka reduce the number of displayed events, but that sounds to me like a more advanced feature. can come back to it later. anyway, let's focus on (1).


imho there are two categories of methods to deal with spam in a public "all events" dumpster fire feed:

  1. detection algorithms and heuristics, to which, in my head, these belong:
     • postpone showing new events based on rules: but which rules, who defines them and, most importantly, who would update them, because they 100% will need an update from time to time
     • "user seems to be spamming [show anyway]": but who defines what's "spam"? there are people who like posting a lot but i certainly wouldn't define it as spam
     • detect if a relay becomes too spammy: but where to draw the line? a popular relay with many users can easily be taken for "spammy"
     • a rule that if a pubkey/content is muted too often the client decides to mute as well: sounds like a client cannot exist without a coordinating server/relay, but we want clients to be independent from relays
     • ratelimit by ip address: doesn't really help against spammers with big resources; just sign up to aws, gcloud or azure and you have big ranges of ip addresses
     • selfhosted model to detect % of porn in an image: the problem is, today it's porn, tomorrow it's something else
     • local word-ban-list: who decides what's on the list, us? but i'd like to be able to say "fuck" once in a while; is it users themselves? then the banlist itself can be abused and we're back to the starting point
     • popular wordlists and merge into theirs: it's no better than mastodon or twitter moderation imho
  2. make spamming infeasible: too hard, too ineffective, too little impact, so the spammers eventually give up.

obviously, (2) is ideal. i'd like to stick to it and implement something from (1) as a last resort.


> also 100% for 2. this is a nice idea. could be configured for both, sending and displaying?

yes, definitely for sending, too. if nostrweb required PoW'ed events to show up on a public feed, i think it's logical if it also offered users a way to mine their events so they show up in that same public feed.

> is there a nip for muting or other kind of reporting?

only for public chats (https://github.com/nostr-protocol/nips/blob/master/28.md#kind-43-hide-message) and it's a personal preference anyway; we want to improve a public "all events" dumpster feed which is the same for all new visitors.

> we could run a bot that likes everything that it thinks is not porn. and option to follow this bot and only display what it likes.

fiatjaf actually suggested something similar. instead of focusing on a list of "bootstrap" default relays, make a bootstrap list of some known "good" pubkeys and show events only from them.

offbyn commented 2 years ago
Owner

lol ofc I don't want it to be mastodon :)

"who decides" should always be the user's choice, with good defaults. also it doesn't have to be delayed or hidden, it could be a "sensitive content warning - show anyway" one-liner.

> ratelimit by ip address: doesn't really help against spammers with big resources; just sign up to aws, gcloud or azure and you have big ranges of ip addresses

but it would help against small spammers, and even have a little impact on a spammer with more resources.

I still think the most effective place is at the relay level, i.e. pow, ratelimiting or invite-only relays.

I saw some spam today for the first time, I was so happy :) in this case the spammer used the same image link repeatedly; detecting and delaying this event or hiding it behind "<N> times repeated message - show anyway" would have helped a lot. but ofc this is a bit of a silly weak defense. such a spam-throttle could be a setting to opt out of (on by default). ofc it's not a strong

x1ddos commented 2 years ago
Poster
Owner

@offbyn here's my proposal. wdyt?

  1. set up an exprelay.nostr.ch, similar to fiatjaf's expensive relay (https://github.com/fiatjaf/relayer/tree/master/expensive) but instead of generating lightning invoices, use keysend (https://docs.lightning.engineering/lightning-network-tools/lnd/amp) to reduce "too many invoices" abuse.
     • a user can whitelist their nostr pubkey on exprelay but they'll have to pay some small amount of sats, although large enough to make it cost-ineffective for spammers.
     • exprelay would accept events either from pay-to-whitelist pubkeys or pow'ed events (nip-13, https://github.com/nostr-protocol/nips/blob/master/13.md).
  2. implement nip-13 in nostrweb. users will have to mine an event if they want to post to exprelay, unless of course they paid to register (whitelist) their pubkey. they can still post events to other relays of their choosing regardless.

  3. nostrweb's public feed would then include the following and nothing else:

     • all events from exprelay.nostr.ch and expensive-relay.fiatjaf.com; i just don't know any other live "expensive" relays at the moment
     • pow'ed events from other relays - can keep the current default relay list with a filter similar to the example in nip-13 (https://github.com/nostr-protocol/nips/blob/master/13.md#querying-relays-for-pow-notes)
     • hand-picked pubkeys from https://nostr.directory and elsewhere, although not sure which keys to pick

and that is all for the public feed. users can then follow other pubkeys, post any kinds of events to other relays of their choice, do whatever they want, but it won't impact what's visible on the default public feed. then we can also re-enable image previews.
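
roughly, the three inclusion rules for the public feed could be sketched like this. the type and function names are invented for the example, the pow difficulty is assumed to be computed elsewhere (leading zero bits of the event id, as in nip-13), and the `wss://` urls, pubkey and threshold in the usage note are placeholders.

```typescript
// What the client knows about an incoming event for this decision.
interface FeedCandidate {
  pubkey: string;
  relayUrl: string; // relay the event was received from
  powDifficulty: number; // leading zero bits of the event id, computed elsewhere
}

// Hypothetical policy object; all values would come from user settings.
interface PublicFeedPolicy {
  expensiveRelays: string[];
  trustedPubkeys: string[];
  minPowDifficulty: number;
}

// An event is shown if it comes from an expensive relay, from a hand-picked
// pubkey, or carries enough proof of work. No content inspection is involved.
function isAllowedInPublicFeed(c: FeedCandidate, policy: PublicFeedPolicy): boolean {
  if (policy.expensiveRelays.indexOf(c.relayUrl) !== -1) return true;
  if (policy.trustedPubkeys.indexOf(c.pubkey) !== -1) return true;
  return c.powDifficulty >= policy.minPowDifficulty;
}
```

for example, with `expensiveRelays: ["wss://exprelay.nostr.ch", "wss://expensive-relay.fiatjaf.com"]` and `minPowDifficulty: 20`, an event with no pow from an unknown pubkey on some other relay would simply not appear in the public feed, while still being visible to anyone who follows that pubkey.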

so, absolutely no content moderation. every user picks what they want to see in their "home feed" and they could use the public feed, hopefully clean at this point, as a search starting point.

x1ddos commented 2 years ago
Poster
Owner

> hiding behind "<N> times repeated message - show anyway" would have helped a lot

gmail used to have something similar years ago. spammers got smarter and started changing each message very slightly. some were successful even just by adding a few invisible space characters here and there.
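
to illustrate the cat and mouse part: a repeat counter has to normalize content before comparing, otherwise a single zero-width character defeats it. the sketch below (function names made up) handles exactly that one trick and nothing more; any other small variation, like an extra emoji or a changed digit, would still get through.

```typescript
// Strips zero-width characters and collapses whitespace before comparing.
function normalizeForComparison(content: string): string {
  return content
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, "") // zero-width space/joiners, word joiner, BOM
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim()
    .toLowerCase();
}

// Counts how often each normalized content string appears.
function countRepeats(contents: string[]): { [normalized: string]: number } {
  // Object.create(null) avoids clashes with built-in property names.
  const counts: { [normalized: string]: number } = Object.create(null);
  contents.forEach((c) => {
    const key = normalizeForComparison(c);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}
```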

offbyn commented 2 years ago
Owner

> then we can also re-enable image previews.

true.

I generally like your proposal; conceptually I still think there is room for content-based rules that are configurable.

another point: your proposal is for ignoring all other events, right?

we always talk about dropping the event completely, but it could also just be the deciding factor for when to use noxy. could be a nice first step to only show link previews and profile images from expensive relays or events with pow, or from a vip/celebrity list :)

x1ddos commented 2 years ago
Poster
Owner

> to only show link preview and profile images from expensive relay or events with pow ...

yeah, for sure a good point. this has to happen on noxy server-side and "event validation" was the slowest part in the proxy. but we'll try anyway!

offbyn commented 2 years ago
Owner

> yeah, for sure a good point. this has to happen on noxy server-side and "event validation" was the slowest part in the proxy. but we'll try anyway!

nostrweb could also validate. it knows if an event came from an expensive relay, or could maybe check if the pow proof checks out, and depending on that not even initiate a noxy request. ofc in noxy you have to check too.

x1ddos added this to the mvp milestone 2 years ago
x1ddos added a new dependency 2 years ago
Reference: nostr/nostrweb#51