Tracking user IDs in the crawler
We’ve just deployed a change to the crawler: we now store the user IDs of the users who crawled each URL in the index, along with a timestamp of when it was last crawled.

Why track user IDs?

Until now, we’ve required volunteers to go through a vetting process before they can crawl. Tracking user IDs lets us remove that requirement. The plan is that anyone can agree to the terms of service and generate their own API key to start crawling straight away. (API key generation isn’t implemented yet, but this change is the foundation for it.) ...
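To make the index change concrete, here is a minimal Python sketch of what a per-URL record with user IDs and a last-crawled timestamp might look like. It is not the crawler's actual code or schema: `IndexEntry`, `record_crawl`, the field names, and the in-memory dictionary standing in for the index are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional, Set


@dataclass
class IndexEntry:
    """Hypothetical per-URL record; field names are illustrative only."""
    url: str
    # IDs of every user who has crawled this URL.
    crawled_by: Set[str] = field(default_factory=set)
    # Time of the most recent crawl, or None if never crawled.
    last_crawled: Optional[datetime] = None


def record_crawl(
    index: Dict[str, IndexEntry],
    url: str,
    user_id: str,
    when: Optional[datetime] = None,
) -> IndexEntry:
    """Record that `user_id` crawled `url`, updating the timestamp if newer."""
    when = when or datetime.now(timezone.utc)
    entry = index.setdefault(url, IndexEntry(url=url))
    entry.crawled_by.add(user_id)
    # Keep the latest crawl time even if results arrive out of order.
    if entry.last_crawled is None or when > entry.last_crawled:
        entry.last_crawled = when
    return entry


# Usage: two hypothetical users crawl the same URL.
index: Dict[str, IndexEntry] = {}
record_crawl(index, "https://example.com/", "user-a")
record_crawl(index, "https://example.com/", "user-b")
```

In this sketch the entry keeps a set of user IDs rather than a single ID, which matches the description of storing "the user IDs of the users who crawled each URL". How the real index represents this is not stated here.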