Hiroyuki proposes 4chan Search Engine: Use FoolFuuka's Sphinx? #146
In his Q&A thread, Hiroyuki recently stated that he's thinking about a search engine for 4chan. How about we recommend that he use FoolFuuka's Sphinx Search Engine (free, open source, all data stays on server)?
FoolFuuka's implementation of the Sphinx open source search server has an interface that is familiar to 4chan users, has been battle-tested on many archiver sites, and has proven powerful for sifting through piles of 4chan threads. Most of all, all data stays on the site.
It is 2015. Times have changed. There is no reason to have a third-party contractor (like Hottolink) implement a website's search engine when you can do it yourself for free.
Plan B: Hyperlink to Archives
Propose that each board get a search hyperlink. This simply links to the corresponding Fuuka archiver as is; nothing changes on 4chan's side. Dead thread URLs could redirect to the archiver as well.
Advertise this as making search and archive view possible "without any need to strain or overhaul 4chan, and with access to threads 2 years back or more".
Our implicit goal is that no actual search handling will be done by 4chan.
If he accepts this (I think he is pretty darn eager to take any suggestion without thinking too deeply, he's usually wasted), then we are fine. If not, he gets a chance to testify with some kind of reason, and the 4chan userbase acts from there.
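To make the Plan B idea concrete, here is a minimal sketch of the two pieces: building a per-board search link and mapping a dead thread path to its archive copy. Everything specific in it is an assumption for illustration: the `ARCHIVES` table and its `archive.example.org` hostname are placeholders, the helper names are made up, and the `/board/search/text/query/` and `/board/thread/number/` URL shapes are assumed FoolFuuka-style paths that would need checking against each real archiver.

```python
import re
from urllib.parse import quote

# Hypothetical board -> archive base URL table; the hostname is a placeholder,
# not a real archiver.
ARCHIVES = {
    "g": "https://archive.example.org",
    "a": "https://archive.example.org",
}

# Matches 4chan-style thread paths such as /g/thread/12345 or /g/thread/12345/slug
THREAD_PATH = re.compile(r"^/(?P<board>[a-z0-9]+)/thread/(?P<num>\d+)")


def search_link(board, query):
    """Return the archive search URL for a board, or None if no archive covers it."""
    base = ARCHIVES.get(board)
    if base is None:
        return None
    # Assumed FoolFuuka-style search path; the query is percent-encoded.
    return f"{base}/{board}/search/text/{quote(query, safe='')}/"


def dead_thread_redirect(path):
    """Map a dead thread path to its archive URL, or None if it cannot be mapped."""
    m = THREAD_PATH.match(path)
    if m is None:
        return None
    base = ARCHIVES.get(m["board"])
    if base is None:
        return None
    # Assumed FoolFuuka-style thread path.
    return f"{base}/{m['board']}/thread/{m['num']}/"
```

With something like this, 4chan would only ever emit links and redirects; the search queries themselves would be handled entirely by the archiver.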
2channel + Hottolink
Hiroyuki recently stated the following in his Q&A, in response to allegations from 2ch users about data mining:
Oct. 2012: Entered into an exclusive commercial licensing agreement with Tokyo Plus Co., Ltd. and Mirai Kensaku Brazil, LLC, the operators of the 2channel site, for information posted on the 2channel site
In Japanese copyright law at that time, you can't upload contents without permission, even it is search engines.
hotlink Inc., provided custom search engines for clients for marketing purpose,
So, the company need 2ch permission to make the search engines.
And hot link is a public company in Japanese stock market.
If they are lying, you can get tons of money by suing them. Go ahead. Get rich. :)
Like this one, https://gnip.com/sources/twitter/
What hotlink., Inc. wants is publicly available text messages.
They don't want any personal information.
- Hiroyuki
Notice that he says he licensed (not sold) them publicly viewable text metadata, which is the same data a search engine like Google crawls. This way, users could search the large text archives. This is the same data anyone can get in a web scrape or (in the case of 2ch) on the Internet Archive.
If I were an evil data mining corporation, I sure as hell wouldn't need to ask anyone for permission to scrape 4chan or use the Fuuka Archives.
