Welcome back to another edition of this series on easily missed errors in machine learning workflows! For those who haven't read the first one, this is part of a series that focuses predominantly on procedural errors that may not always be obvious but have a high potential to degrade model performance if they slip into our development pipeline.
In the first article, we explored common pitfalls like misusing numerical identifiers, mishandling data splits, and overfitting the model to rare feature values.
In this edition, we'll continue to explore some errors related to data handling, specifically focusing on the following two topics:
- Training with data not available at prediction time
- Mixing magic numbers with actual numbers