When learning a new language, I like to read existing code. The https://soogle.org collection uses various free (GitHub, walk-specific sites) and paid ($40/month: https://serpapi.com/) sources to collect data. An expensive Anthropic opus-4-6 model reviews sampled input and writes instructions for a cheap Anthropic claude-haiku-4-5 model, which then checks the input to filter out things like "Smalltalk how to get a date" and "And Pascal is better than Smalltalk.". The project is open source: https://github.com/avwohl/soogle
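For anyone curious how the two-model filter might fit together, here is a minimal sketch. It is not the actual soogle code: the function names, prompt wording, and KEEP/DROP reply format are all assumptions made for illustration, and it assumes the collection targets Smalltalk code, as the filter examples suggest. Each model is passed in as a plain function (prompt in, text out), so the sketch does not depend on any particular client library.

```python
import random
from typing import Callable, Iterable, List

# Hypothetical type: any function that sends a prompt to a model
# and returns the model's text reply.
Complete = Callable[[str], str]


def draft_filter_instructions(
    items: Iterable[str],
    expensive_model: Complete,
    sample_size: int = 20,
    seed: int = 0,
) -> str:
    """Ask the expensive model to write filter instructions from a sample."""
    pool = list(items)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = rng.sample(pool, min(sample_size, len(pool)))
    prompt = (
        "These are sampled inputs gathered for a Smalltalk code collection.\n"
        "Write short instructions for a reviewer on how to reject inputs "
        "that are not about Smalltalk source code.\n\n"
        + "\n".join(f"- {s}" for s in sample)
    )
    return expensive_model(prompt)


def filter_items(
    items: Iterable[str],
    instructions: str,
    cheap_model: Complete,
) -> List[str]:
    """Run the cheap model on every item and keep those it answers KEEP for."""
    kept = []
    for item in items:
        prompt = f"{instructions}\n\nInput: {item}\nAnswer KEEP or DROP."
        reply = cheap_model(prompt)
        if reply.strip().upper().startswith("KEEP"):
            kept.append(item)
    return kept
```

The point of the split is cost: the expensive model runs once on a small sample, while the cheap model runs once per item. In real use each `Complete` function would wrap a call to the corresponding model.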