Hot take: I was sure fine-tuning a model from scratch was the only way to get good results for my project.

For a small text sorting tool I'm making, I spent two weeks trying to fine-tune a base GPT model with my own data. It was slow and the results were messy. Then a friend said to just use prompt engineering with GPT-4's API instead. I switched last Friday, wrote a better system prompt with clear examples, and got it working perfectly in about three hours. The API call was way simpler and cheaper. Has anyone else moved from custom training to better prompting and been shocked by the difference?
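For anyone curious what "a better system prompt with clear examples" looks like in practice, here's a minimal sketch of the few-shot approach: a system prompt plus a handful of labeled examples packed into one chat request. The category names and example texts are hypothetical, not from my actual tool.

```python
def build_messages(text_to_sort):
    """Assemble a few-shot chat prompt for a text-sorting task."""
    system_prompt = (
        "You are a text classifier. Assign each input to exactly one "
        "category: 'bug report', 'feature request', or 'question'. "
        "Reply with the category name only."
    )
    # Labeled examples shown to the model as prior user/assistant turns.
    few_shot = [
        ("The app crashes when I open settings.", "bug report"),
        ("Please add a dark mode toggle.", "feature request"),
        ("How do I export my data?", "question"),
    ]
    messages = [{"role": "system", "content": system_prompt}]
    for example, label in few_shot:
        messages.append({"role": "user", "content": example})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": text_to_sort})
    return messages

# With the official openai client, the whole thing is then one request
# (needs an API key, so not executed here):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4",
#       messages=build_messages("Why won't my files sync?"),
#   )
#   print(resp.choices[0].message.content)
```

No training loop, no compute bill: each new category is just another example in the list.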
3 Comments
torres.pat
23d ago
Wow, but doesn't that just depend on what you're trying to do?
6
the_jesse
22d ago
Yeah, "fine-tuning from scratch" is almost never the right first step.
5
matthewh28
20h ago
Seriously, why would you start from zero? I tried that once for a customer service bot: spent weeks and a ton of money on compute just to get it back to where a good base model already was out of the box. Starting with a strong pre-trained model and then tweaking it is the only way that makes sense.
6